Dipeptidyl peptidases

ABSTRACT

Peptides which comprise the sequence shown in SEQ ID NO:2, or the sequences HisGlyTrpSerTyrGlyGlyPheLeu; LeuAspGluAsnValHisPhePhe; GluArgHisSerIleArg and PheValIleGlnGluGluPhe, and which show peptidase ability and have substrate specificity for at least one of the compounds H-Ala-Pro-pNA, H-Gly-Pro-pNA and H-Arg-Pro-pNA, are claimed. Peptides having the sequence shown in SEQ ID NO:7 are also claimed. Nucleic acids, vectors, antibodies and hybridoma cells are also claimed with reference to the above sequences and their abilities.

FIELD OF INVENTION

[0001] The invention relates to a dipeptidyl peptidase, to a nucleic acid molecule which encodes it, and to uses of the peptidase.

BACKGROUND OF THE INVENTION

[0002] The dipeptidyl peptidase (DPP) IV-like gene family is a family of molecules which have related protein structure and function [1-3]. The gene family includes the following molecules: DPPIV (CD26), dipeptidyl amino-peptidase-like protein 6 (DPP6), dipeptidyl amino-peptidase-like protein 8 (DPP8) and fibroblast activation protein (FAP) [1,2,4,5]. Another possible member is DPPIV-β[6].

[0003] The molecules of the DPPIV-like gene family are serine proteases and members of the peptidase family S9b. Together with prolyl endopeptidase (S9a) and acylaminoacyl peptidase (S9c), they make up the prolyl oligopeptidase family[5,7].

[0004] DPPIV and FAP have similar post-proline dipeptidyl amino peptidase activity; however, unlike DPPIV, FAP also has gelatinase activity[8,9].

[0005] DPPIV substrates include chemokines such as RANTES, eotaxin, macrophage-derived chemokine and stromal-cell-derived factor 1; growth factors such as glucagon and glucagon-like peptides 1 and 2; neuropeptides including neuropeptide Y and substance P; and vasoactive peptides[10-12].

[0006] DPPIV and FAP also have non-catalytic activity; DPPIV binds adenosine deaminase, and FAP binds to α₃β₁ and α₅β₁ integrins[13-14].

[0007] In view of the above activities, the DPPIV-like family members are likely to have roles in intestinal and renal handling of proline containing peptides, cell adhesion, peptide metabolism, including metabolism of cytokines, neuropeptides, growth factors and chemokines, and immunological processes, specifically T cell stimulation[3,11,12].

[0008] Consequently, the DPPIV-like family members are likely to be involved in the pathology of disease, including for example, tumour growth and biology, type II diabetes, cirrhosis, autoimmunity, graft rejection and HIV infection[3,15-18].

[0009] Inhibitors of DPPIV have been shown to suppress arthritis, and to prolong cardiac allograft survival in animal models in vivo[19,20]. Some DPPIV inhibitors are reported to inhibit HIV infection[21]. It is anticipated that DPPIV inhibitors will be useful in other therapeutic applications including treating diarrhoea, growth hormone deficiency, lowering glucose levels in non insulin dependent diabetes mellitus and other disorders involving glucose intolerance, enhancing mucosal regeneration and as immunosuppressants[3,21-24].

[0010] There is a need to identify members of the DPPIV-like gene family as this will allow the identification of inhibitor(s) with specificity for particular family member(s), which can then be administered for the purpose of treatment of disease. Alternatively, the identified member may of itself be useful for the treatment of disease.

SUMMARY OF THE INVENTION

[0011] The present invention seeks to address the above identified need and in a first aspect provides a peptide which comprises the amino acid sequence shown in SEQ ID NO:2. As described herein, the inventors believe that the peptide is a prolyl oligopeptidase and a dipeptidyl peptidase, because it has substantial and significant homology with the amino acid sequences of DPPIV and DPP8. As homology is observed between DPP8, DPPIV and DPP9, it will be understood that DPP9 has a substrate specificity for at least one of the following compounds: H-Ala-Pro-pNA, H-Gly-Pro-pNA and H-Arg-Pro-pNA.

[0012] The peptide is homologous with human DPPIV and DPP8, and importantly, identity between the sequences of DPPIV and DPP8 and SEQ ID NO: 2 is observed at the regions of DPPIV and DPP8 containing the catalytic triad residues and the two glutamate residues of the β-propeller domain essential for DPPIV enzyme activity. The observation of amino acid sequence homology means that the peptide which has the amino acid sequence shown in SEQ ID NO:2 is a member of the DPPIV-like gene family. Accordingly the peptide is now named and described herein as DPP9.

[0013] The following sequences of the human DPPIV amino acid sequence are important for the catalytic activity of DPPIV: (i) Trp⁶¹⁷GlyTrpSerTyrGlyGlyTyrVal; (ii) Ala⁷⁰⁷AspAspAsnValHisPhe; (iii) Glu⁷³⁸AspHisGlyIleAlaSer; and (iv) Trp²⁰¹ValTyrGluGluGluVal [25-28]. As described herein, the alignment of the following sequences of DPP9: His⁸³³GlyTrpSerTyrGlyGlyPheLeu; Leu⁹¹³AspGluAsnValHisPhePhe; Glu⁹⁴⁴ArgHisSerIleArg and Phe³⁵⁰ValIleGlnGluGluPhe with sequences (i) to (iv) above, respectively, suggests that these sequences of DPP9 are likely to confer the catalytic activity of DPP9. This is also supported by the alignment of DPP9 and DPP8 amino acid sequences. More specifically, DPP8 has substrate specificity for H-Ala-Pro-pNA, H-Gly-Pro-pNA and H-Arg-Pro-pNA, and shares near identity, with only one position of amino acid difference, in each of the above described sequences of DPP9. Thus, in a second aspect, the invention provides a peptide comprising the following amino acid sequences: HisGlyTrpSerTyrGlyGlyPheLeu; LeuAspGluAsnValHisPhePhe; GluArgHisSerIleArg and PheValIleGlnGluGluPhe; which has the substrate specificity of the sequence shown in SEQ ID NO:2.
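
The residue numbering implied by the superscripts in the DPP9 sequences recited above can be checked mechanically. The following is a minimal illustrative sketch and forms no part of the disclosed methods; the function names `split_motif`, `number_motif` and `one_letter` are hypothetical, while the sequences and starting positions are those recited in this paragraph.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def split_motif(motif):
    """Split a run of three-letter codes such as 'HisGlyTrp' into a list."""
    codes = re.findall(r"[A-Z][a-z]{2}", motif)
    # Reject anything that is not a clean concatenation of known codes.
    if "".join(codes) != motif or any(c not in THREE_TO_ONE for c in codes):
        raise ValueError(f"not a clean three-letter motif: {motif!r}")
    return codes


def number_motif(motif, start):
    """Pair each residue with its position, given the first residue's number."""
    return [(start + i, code) for i, code in enumerate(split_motif(motif))]


def one_letter(motif):
    """Convert a three-letter motif to one-letter code."""
    return "".join(THREE_TO_ONE[c] for c in split_motif(motif))


# DPP9 sequences and the position of their first residue, as recited above.
DPP9_MOTIFS = {
    "HisGlyTrpSerTyrGlyGlyPheLeu": 833,
    "LeuAspGluAsnValHisPhePhe": 913,
    "GluArgHisSerIleArg": 944,
    "PheValIleGlnGluGluPhe": 350,
}

if __name__ == "__main__":
    for motif, start in DPP9_MOTIFS.items():
        print(one_letter(motif), number_motif(motif, start))
```

On this numbering the serine of the first sequence falls at position 836, the aspartate of the second at 914, the histidine of the third at 946, and the two adjacent glutamates of the fourth at 354 and 355.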

[0014] Also described herein, using the GAP sequence alignment algorithm, it is observed that DPP9 has 53% amino acid similarity and 29% amino acid identity with a C. elegans protein. Further, as shown herein, a nucleic acid molecule which encodes DPP9, is capable of hybridising specifically with DPP9 sequences derived from non-human species, including rat and mouse. Further, the inventors have isolated and characterised a mouse homologue of human DPP9. Together these data demonstrate that DPP9 is expressed in non-human species. Thus in a third aspect, the invention provides a peptide which has at least 91% amino acid identity with the amino acid sequence shown in SEQ ID NO:2, and which has the substrate specificity of the sequence shown in SEQ ID NO:2. Typically the peptide has the sequence shown in SEQ ID NO:4. Preferably, the amino acid identity is 75%. More preferably, the amino acid identity is 95%. Amino acid identity is calculated using GAP software [GCG Version 8, Genetics Computer Group, Madison, Wis., USA] as described further herein. Typically, the peptide comprises the following sequences: HisGlyTrpSerTyrGlyGlyPheLeu; LeuAspGluAsnValHisPhePhe; GluArgHisSerIleArg and PheValIleGlnGluGluPhe.

[0015] In view of the homology between DPPIV, DPP8 and DPP9 amino acid sequences, it is expected that these sequences will have similar tertiary structure. This means that the tertiary structure of DPP9 is likely to include the seven-blade β-propeller domain and the α/β hydrolase domain of DPPIV. These structures in DPP9 are likely to be conferred by the regions comprising β-propeller, Val²²⁶ to Ala⁷⁰⁵, α/β hydrolase, Ser⁷⁰⁶ to Leu⁹⁶⁹ and about 70 to 90 residues in the region Ser¹³⁶ to Gly²²⁵. As it is known that the β-propeller domain regulates proteolysis mediated by the catalytic triad in the α/β hydrolase domain of prolyl oligopeptidase, [29] it is expected that truncated forms of DPP9 can be produced, which have the substrate specificity of the sequence shown in SEQ ID NO:2, comprising the regions referred to above (His⁸³³GlyTrpSerTyrGlyGlyPheLeu; Leu⁹¹³AspGluAsnValHisPhePhe; Glu⁹⁴⁴ArgHisSerIleArg and Phe³⁵⁰ValIleGlnGluGluPhe) which confer the catalytic specificity of DPP9. Examples of truncated forms of DPP9 which might be prepared are those in which the region conferring the β-propeller domain and the α/β hydrolase domain are spliced together. Other examples of truncated forms include those that are encoded by splice variants of DPP9 mRNA. Thus although, as described herein, the biochemical characterisation of DPP9 shows that DPP9 consists of 969 amino acids and has a molecular weight of about 110 kDa, it is recognised that truncated forms of DPP9 which have the substrate specificity of the sequence shown in SEQ ID NO:2, may be prepared using standard techniques [30,31]. Thus in a fourth aspect, the invention provides a fragment of the sequence shown in SEQ ID NO: 2, which has the substrate specificity of the sequence shown in SEQ ID NO:2. The inventors believe that a fragment from Ser136 to Leu969 (numbered according to SEQ ID NO:2) would have enzyme activity.

[0016] It is recognised that DPP9 may be fused, or in other words, linked to a further amino acid sequence, to form a fusion protein which has the substrate specificity of the sequence shown in SEQ ID NO:2. An example of a fusion protein is one which comprises the sequence shown in SEQ ID NO:2 which is linked to a further amino acid sequence: a “tag” sequence which consists of an amino acid sequence encoding the V5 epitope and a His tag. An example of another further amino acid sequence which may be linked with DPP9 is a glutathione S transferase (GST) domain [30]. Another example of a further amino acid sequence is a portion of CD8α [8]. Thus in one aspect, the invention provides a fusion protein comprising the amino acid sequence shown in SEQ ID NO:2 linked with a further amino acid sequence, the fusion protein having the substrate specificity of the sequence shown in SEQ ID NO:2.

[0017] It is also recognised that the peptide of the first aspect of the invention may be comprised in a polypeptide, so that the polypeptide has the substrate specificity of DPP9. The polypeptide may be useful, for example, for altering the protease susceptibility of DPP9, when used in in vivo applications. An example of a polypeptide which may be useful in this regard, is albumin. Thus in another embodiment, the peptide of the first aspect is comprised in a polypeptide which has the substrate specificity of DPP9.

[0018] In one aspect, the invention provides a peptide which includes the amino acid sequence shown in SEQ ID NO:7. In one embodiment the peptide consists of the amino acid sequence shown in SEQ ID NO:7.

[0019] As described further herein, the amino acid sequence shown in SEQ ID NO:7, and the amino acid sequences of DPPIV, DPP8 and FAP are homologous. DPPIV, DPP8 and FAP have dipeptidyl peptidase enzymatic activity and have substrate specificity for peptides which contain the di-peptide sequence, Ala-Pro. The inventors note that the amino acid sequence shown in SEQ ID NO:7 contains the catalytic triad, Ser-Asp-His. Accordingly, it is anticipated that the amino acid sequence shown in SEQ ID NO:7 has enzymatic activity in being capable of cleaving a peptide which contains Ala-Pro by hydrolysis of a peptide bond located C-terminal adjacent to proline in the di-peptide sequence.

[0020] In one embodiment, the peptide comprises an amino acid sequence shown in SEQ ID NO:7 which is capable of cleaving a peptide bond which is C-terminal adjacent to proline in the sequence Ala-Pro. The capacity of a dipeptidyl peptidase to cleave a peptide bond which is C-terminal adjacent to proline in the di-peptide sequence Ala-Pro can be determined by standard techniques, for example, by observing hydrolysis of a peptide bond which is C-terminal adjacent to proline in the molecule Ala-Pro-p-nitroanilide.

[0021] The inventors recognise that by using standard techniques it is possible to generate a peptide which is a truncated form of the sequence shown in SEQ ID NO:7, which retains the proposed enzymatic activity described above. An example of a truncated form of the amino acid sequence shown in SEQ ID NO:7 which retains the proposed enzymatic activity is a form which includes the catalytic triad, Ser-Asp-His. Thus a truncated form may consist of less than the 831 amino acids shown in SEQ ID NO:7. Accordingly, in a further embodiment, the peptide is a truncated form of the peptide shown in SEQ ID NO:7, which is capable of cleaving a peptide bond which is C-terminal adjacent to proline in the sequence Ala-Pro.

[0022] It will be understood that the amino acid sequence shown in SEQ ID NO:7 may be altered by one or more amino acid deletions, substitutions or insertions of that amino acid sequence and yet retain the proposed enzymatic activity described above. It is expected that a peptide which is at least 47% similar to the amino acid sequence of SEQ ID NO:7, or which is at least 27% identical to the amino acid sequence of SEQ ID NO:7, will retain the proposed enzymatic activity described above. The % similarity can be determined by use of the program/algorithm “GAP” which is available from Genetics Computer Group (GCG), Wisconsin. Thus in another embodiment of the first aspect, the peptide has an amino acid sequence which is at least 47% similar to the amino acid sequence shown in SEQ ID NO:7, and is capable of cleaving a peptide bond which is C-terminal adjacent to proline in the sequence Ala-Pro.

[0023] As described above, the isolation and characterisation of DPP9 is necessary for identifying inhibitors of DPP9 catalytic activity, which may be useful for the treatment of disease. Accordingly, in a fifth aspect, the invention provides a method of identifying a molecule capable of inhibiting cleavage of a substrate by DPP9, the method comprising the following steps:

[0024] (a) contacting DPP9 with the molecule;

[0025] (b) contacting DPP9 of step (a) with a substrate capable of being cleaved by DPP9, in conditions sufficient for cleavage of the substrate by DPP9; and

[0026] (c) detecting substrate not cleaved by DPP9, to identify that the molecule is capable of inhibiting cleavage of the substrate by DPP9.
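
Steps (a) to (c) could be scored, for example, by comparing the rate of chromogenic substrate cleavage with and without the candidate molecule. The sketch below is illustrative only: the function names, the use of absorbance-versus-time readings, the example values and the 50% cut-off are all assumptions that are not taken from this specification.

```python
def slope(times, readings):
    """Least-squares slope of readings against time (a simple rate estimate)."""
    n = len(times)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two paired time points")
    mean_t = sum(times) / n
    mean_r = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, readings))
    return sxy / sxx


def percent_inhibition(rate_without, rate_with):
    """Percentage reduction in cleavage rate caused by the candidate molecule."""
    if rate_without <= 0:
        raise ValueError("uninhibited control shows no cleavage")
    return 100.0 * (1.0 - rate_with / rate_without)


def is_inhibitor(rate_without, rate_with, cutoff_percent=50.0):
    """Hypothetical decision rule: call a hit at or above the cut-off."""
    return percent_inhibition(rate_without, rate_with) >= cutoff_percent


if __name__ == "__main__":
    minutes = [0, 10, 20, 30]
    control = [0.05, 0.25, 0.45, 0.65]  # made-up absorbance readings, no molecule
    treated = [0.05, 0.10, 0.15, 0.20]  # made-up readings with the molecule
    r0 = slope(minutes, control)
    r1 = slope(minutes, treated)
    print(round(percent_inhibition(r0, r1), 1), is_inhibitor(r0, r1))
```

A larger proportion of uncleaved substrate in the presence of the molecule corresponds to a higher percentage inhibition in this scheme.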

[0027] It is recognised that although inhibitors of DPP9 may also inhibit DPPIV and other serine proteases, as described herein, the alignment of the DPP9 amino acid sequence with the most closely related molecules (e.g. DPPIV) reveals that the DPP9 amino acid sequence is distinctive, particularly at the regions controlling substrate specificity. Accordingly, it is expected that it will be possible to identify inhibitors which inhibit DPP9 catalytic activity specifically, and which do not inhibit the catalytic activity of other DPPIV-like gene family members or other serine proteases. Thus, in a sixth aspect, the invention provides a method of identifying a molecule capable of inhibiting specifically, the cleavage of a substrate by DPP9, the method comprising the following steps:

[0028] (a) contacting DPP9 and a further protease with the molecule;

[0029] (b) contacting DPP9 and the further protease of step (a) with a substrate capable of being cleaved by DPP9 and the further protease, in conditions sufficient for cleavage of the substrate by DPP9 and the further protease; and

[0030] (c) detecting substrate not cleaved by DPP9, but cleaved by the further protease, to identify that the molecule is capable of inhibiting specifically, the cleavage of the substrate by DPP9.
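
The specificity test of the steps above compares the same candidate molecule against DPP9 and a further protease. As a hedged sketch (the function names, the example rates and both cut-off values are hypothetical and do not come from this specification), a molecule might be called DPP9-specific when it suppresses cleavage by DPP9 while leaving cleavage by the further protease largely intact:

```python
def residual_activity(rate_without, rate_with):
    """Fraction of cleavage activity remaining in the presence of the molecule."""
    if rate_without <= 0:
        raise ValueError("control reaction shows no cleavage")
    return rate_with / rate_without


def is_specific_for_dpp9(dpp9_rates, other_rates,
                         max_dpp9_residual=0.5, min_other_residual=0.9):
    """Each argument is a (rate_without_molecule, rate_with_molecule) pair.

    Returns True when DPP9 activity drops to the first cut-off or below
    while the further protease keeps at least the second cut-off.
    """
    dpp9_left = residual_activity(*dpp9_rates)
    other_left = residual_activity(*other_rates)
    return dpp9_left <= max_dpp9_residual and other_left >= min_other_residual


if __name__ == "__main__":
    # Made-up rates: DPP9 falls to 20% activity, the further protease keeps 95%.
    print(is_specific_for_dpp9((1.0, 0.2), (1.0, 0.95)))
```

A molecule that reduces both activities, or neither, would not satisfy this rule.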

[0031] In a seventh aspect, the invention provides a method of reducing or inhibiting the catalytic activity of DPP9, the method comprising the step of contacting DPP9 with an inhibitor of DPP9 catalytic activity. In view of the homology between DPP9 and DPP8 amino acid sequences, it will be understood that inhibitors of DPP8 activity may be useful for inhibiting DPP9 catalytic activity. Examples of inhibitors suitable for use in the seventh aspect are described in [21,32,33]. Other inhibitors useful for inhibiting DPP9 catalytic activity can be identified by the methods of the fifth or sixth aspects of the invention.

[0032] In one embodiment, the catalytic activity of DPP9 is reduced or inhibited in a mammal by administering the inhibitor of DPP9 catalytic activity to the mammal. It is recognised that these inhibitors have been used to reduce or inhibit DPPIV catalytic activity in vivo, and therefore, may also be used for inhibiting DPP9 catalytic activity in vivo. Examples of inhibitors useful for this purpose are disclosed in the following [21,32-34].

[0033] Preferably, the catalytic activity of DPP9 in a mammal is reduced or inhibited in the mammal, for the purpose of treating a disease in the mammal. Diseases which are likely to be treated by an inhibitor of DPP9 catalytic activity are those in which DPPIV-like gene family members are associated [3,10,11,17,21,36], including for example, neoplasia, type II diabetes, cirrhosis, autoimmunity, graft rejection and HIV infection.

[0034] Preferably, the inhibitor for use in the seventh aspect of the invention is one which inhibits the cleavage of a peptide bond C-terminal adjacent to proline. As described herein, examples of these inhibitors are 4-(2-aminoethyl)benzenesulfonylfluoride, aprotinin, benzamidine/HCl, Ala-Pro-Gly, H-Lys-Pro-OH HCl salt and zinc ions, for example, zinc sulfate or zinc chloride. More preferably, the inhibitor is one which specifically inhibits DPP9 catalytic activity, and which does not inhibit the catalytic activity of other serine proteases, including, for example DPPIV, DPP8 or FAP.

[0035] In an eighth aspect, the invention provides a method of cleaving a substrate which comprises contacting the substrate with DPP9 in conditions sufficient for cleavage of the substrate by DPP9, to cleave the substrate. Examples of molecules which can be cleaved by the method are H-Ala-Pro-pNA, H-Gly-Pro-pNA and H-Arg-Pro-pNA. Molecules which are cleaved by DPPIV including RANTES, eotaxin, macrophage-derived chemokine, stromal-cell-derived factor 1, glucagon and glucagon-like peptides 1 and 2, neuropeptide Y, substance P and vasoactive peptide are also likely to be cleaved by DPP9 [11,12]. In one embodiment, the substrate is cleaved by cleaving a peptide bond C-terminal adjacent to proline in the substrate. The molecules cleaved by DPP9 may have Ala, or Trp, Ser, Gly, Val or Leu in the P1 position, in place of Pro [11,12].

[0036] The inventors have characterised the sequence of a nucleic acid molecule which encodes the amino acid sequence shown in SEQ ID NO:2. Thus in a tenth aspect, the invention provides a nucleic acid molecule which encodes the amino acid sequence shown in SEQ ID NO:2.

[0037] In an eleventh aspect, the invention provides a nucleic acid molecule which consists of the sequence shown in SEQ ID NO:1.

[0038] In another aspect, the invention provides a nucleic acid molecule which encodes a peptide comprising the amino acid sequence shown in SEQ ID NO:7.

[0039] The inventors have characterised the nucleotide sequence of the nucleic acid molecule encoding SEQ ID NO:7. The nucleotide sequence of the nucleic acid molecule encoding DPP4-like-2 is shown in SEQ ID NO:8. Thus, in one embodiment, the nucleic acid molecule comprises the nucleotide sequence shown in SEQ ID NO:8. In another embodiment, the nucleic acid molecule consists of the nucleotide sequence shown in SEQ ID NO:8.

[0040] The inventors recognise that a nucleic acid molecule which has the nucleotide sequence shown in SEQ ID NO:8 could be made by producing only the fragment of the nucleotide sequence which is translated. Thus in an embodiment, the nucleic acid molecule does not contain 5′ or 3′ untranslated nucleotide sequences.

[0041] As described herein, the inventors observed RNA of 4.4 kb and a minor band of 4.8 kb in length which hybridised to a nucleic acid molecule comprising the sequence shown in SEQ ID NO:8. It is possible that these mRNA species are splice variants. Thus in another embodiment, the nucleic acid molecule comprises the nucleotide sequence shown in SEQ ID NO:8 and is approximately 4.4 kb or 4.8 kb in length.

[0042] In another embodiment, the nucleic acid molecule is selected from the group of nucleic acid molecules consisting of DPP4-like-2a, DPP4-like-2b and DPP4-like-2c, as shown in FIG. 2.

[0043] In another aspect, the invention provides a nucleic acid molecule having a sequence shown in SEQ ID NO: 3.

[0044] In a twelfth aspect, the invention provides a nucleic acid molecule which is capable of hybridising to a nucleic acid molecule consisting of the sequence shown in SEQ ID NO:1 in stringent conditions, and which encodes a peptide which has the substrate specificity of the sequence shown in SEQ ID NO:2. As shown in the Northern blot analysis described herein, DPP9 mRNA hybridises specifically to the sequence shown in SEQ ID NO:1, after washing in 2×SSC/1.0% SDS at 37° C., or after washing in 0.1×SSC/0.1% SDS at 50° C. “Stringent conditions” are conditions in which the nucleic acid molecule is exposed to 2×SSC/1.0% SDS. Preferably, the nucleic acid molecule is capable of hybridising to a molecule consisting of the sequence shown in SEQ ID NO:1 in high stringent conditions. “High stringent conditions” are conditions in which the nucleic acid molecule is exposed to 0.1×SSC/0.1% SDS at 50° C.
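
The two wash conditions above differ twentyfold in salt concentration. The sketch below works out the corresponding ionic composition, assuming the conventional SSC recipe in which a 20×SSC stock is 3 M sodium chloride and 0.3 M trisodium citrate. That recipe is a common laboratory convention and is not recited in this specification, and the function name is hypothetical.

```python
NACL_M_PER_X = 0.15      # mol/L NaCl in 1xSSC (assumed 20x stock = 3 M)
CITRATE_M_PER_X = 0.015  # mol/L trisodium citrate in 1xSSC (assumed 20x stock = 0.3 M)


def ssc_composition(fold):
    """Return molar NaCl, citrate and total sodium ion for a given SSC strength."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    nacl = fold * NACL_M_PER_X
    citrate = fold * CITRATE_M_PER_X
    # Each trisodium citrate contributes three sodium ions.
    sodium = nacl + 3 * citrate
    return {"NaCl_M": nacl, "citrate_M": citrate, "Na_M": sodium}


if __name__ == "__main__":
    for label, fold in (("stringent wash", 2.0), ("high stringent wash", 0.1)):
        print(label, ssc_composition(fold))
```

Lower salt concentration and higher wash temperature both destabilise imperfectly matched duplexes, which is why the 0.1×SSC wash at 50° C. is the more stringent of the two conditions.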

[0045] As described herein, the inventors believe that the gene which encodes DPP9 is located at band p13.3 on human chromosome 19. The location of the DPP9 gene is distinguished from genes encoding other prolyl oligopeptidases, which are located on chromosome 2, at bands 2q24.3 and 2q23, chromosome 7 or chromosome 15q22. Thus in an embodiment, the nucleic acid molecule is one capable of hybridising to a gene which is located at band p13.3 on human chromosome 19.

[0046] It is recognised that a nucleic acid molecule which encodes the amino acid sequence shown in SEQ ID NO:2, or which comprises the sequence shown in SEQ ID NO:1, could be made by producing the fragment of the sequence which is translated, using standard techniques [30,31]. Thus in an embodiment, the nucleic acid molecule does not contain 5′ or 3′ untranslated sequences.

[0047] In a thirteenth aspect, the invention provides a vector which comprises a nucleic acid molecule of the tenth aspect of the invention. In one embodiment, the vector is capable of replication in a COS-7 cell, CHO cell or 293T cell, or E. coli. In another embodiment, the vector is selected from the group consisting of % TripleEx, λTripleEx, pGEM-T Easy Vector, pSecTag2Hygro, pet15b, pEE14.HCMV.gs and pcDNA3.1/V5/His.

[0048] In a fourteenth aspect, the invention provides a cell which comprises a vector of the thirteenth aspect of the invention. In one embodiment, the cell is an E. coli cell. Preferably, the E. coli is MC1061, DH5α, JM109, BL21DE3, pLysS. In another embodiment, the cell is a COS-7, COS-1, 293T or CHO cell.

[0049] In a fifteenth aspect, the invention provides a method for making a peptide of the first aspect of the invention comprising, maintaining a cell according to the fourteenth aspect of the invention in conditions sufficient for expression of the peptide by the cell. The conditions sufficient for expression are described herein. In one embodiment, the method comprises the further step of isolating the peptide.

[0050] In a sixteenth aspect, the invention provides a peptide when produced by the method of the fifteenth aspect.

[0051] In a seventeenth aspect, the invention provides a composition comprising a peptide of the first aspect and a pharmaceutically acceptable carrier.

[0052] In an eighteenth aspect, the invention provides an antibody which is capable of binding a peptide according to the first aspect of the invention. The antibody can be prepared by immunising a subject with purified DPP9 or a fragment thereof according to standard techniques [35]. An antibody may be prepared by immunising with transiently transfected DPP9⁺ cells. It is recognised that the antibody is useful for inhibiting activity of DPP9. In one embodiment, the antibody of the eighteenth aspect of the invention is produced by a hybridoma cell.

[0053] In a nineteenth aspect, the invention provides a hybridoma cell which secretes an antibody of the eighteenth aspect.

BRIEF DESCRIPTION OF THE FIGURES

[0054] FIG. 1. Nucleotide sequence of DPP8 (SEQ ID NO:5).

[0055] FIG. 2. Schematic representation of the cloning of human cDNA DPP9.

[0056] FIG. 3. Schematic representation of the assembly of nucleotide sequences of human cDNA DPP9.

[0057] FIG. 4. Nucleotide sequence of human cDNA DPP9 (SEQ ID NO:1) and amino acid sequence of human DPP9 (SEQ ID NO:2).

[0058] FIG. 5. Alignment of human DPP9 amino acid sequences with the amino acid sequence encoded by a predicted open reading frame of GDD.

[0059] FIG. 6. Alignment of human DPP8, DPP9, DPP4 and FAP amino acid sequences.

[0060] FIG. 7. Northern blot analysis of human DPP9 RNA.

[0061] FIG. 8. Alignment of murine (SEQ ID NO:4) and human DPP9 amino acid sequences.

[0062] FIG. 9. Alignment of murine (SEQ ID NO:3) and human DPP9 cDNA nucleotide sequences.

[0063] FIG. 10. Northern blot analysis of rat DPP9 RNA.

[0064] FIG. 11. Detection of DPP9 cDNA in CEM cells.

[0065] FIG. 12. Detection of murine DPP9 nucleotide sequence.

DETAILED DESCRIPTION OF THE INVENTION

EXAMPLES

[0066] General

[0067] Restriction enzymes and other enzymes used in cloning were obtained from Boehringer Mannheim Roche. Standard molecular biology techniques were used unless indicated otherwise.

[0068] DPP9 Cloning

[0069] The nucleotide sequence of DPP8 shown in FIG. 1 was used to search the GenBank database for homologous nucleotide sequences. Nucleotide sequences referenced by GenBank accession numbers AC005594 and AC005783 were detected and named GDD. The GDD nucleotide sequence is 39.5 kb and has 19 predicted exons. The analysis of the predicted exon-intron boundaries in GDD suggests that the predicted open reading frame of GDD is 3.6 kb in length.

[0070] In view of the homology of DPP8 and the GDD nucleotide sequences, we hypothesised the existence of DPPIV-like molecules other than DPP8. We used oligonucleotide primers derived from the nucleotide sequence of GDD and reverse transcription PCR (RT-PCR) to isolate a cDNA encoding DPPIV-like molecules.

[0071] RT-PCR amplification of human liver RNA derived from a pool of 4 patients with autoimmune hepatitis using the primers GDD pr 1F and GDD pr 1R (Table 1) produced a 500 base pair product. This suggested that DPPIV-like molecules are likely to be expressed in liver cells derived from individuals with autoimmune hepatitis and that RNA derived from these cells is likely to be a suitable source for isolating cDNA clones encoding DPPIV-like molecules.

[0072] Primers GDD pr 3F and GDD pr 1R (Table 1) were then used to isolate a cDNA clone encoding a DPP4-like molecule. A 1.6 kb fragment was observed and named DPP4-like-2a. Primers GDD pr 15F and GDD pr 7R (Table 1) were then used to isolate a further cDNA clone encoding a DPP4-like molecule. A 1.9 kb product was observed and named DPP4-like-2b. As described further herein, the sequence of DPP4-like-2b overlaps with the sequence of DPP4-like-2a.

[0073] The DPP4-like-2a and 2b fragments were gel purified using WIZARD® PCR preps kit and cloned into the pGEM®-T-easy plasmid vector using the EcoRI restriction sites. The ligation reaction was used to transform JM109 competent cells. The plasmid DNA was prepared by miniprep. The inserts were released by EcoRI restriction digestion. The DNA was sequenced in both directions using the M13Forward and M13Reverse sequencing primers. The complete sequence of DPP4-like-2a and 2b fragments was derived by primer walking.

[0074] The nucleotide sequence 5′ adjacent to DPP4-like-2b was obtained by 5′RACE using dC tailing and the gene specific primers GDD GSP1.1 and 2.1 (Table 1). A fragment of 500 base pairs (DPP4-like-2c) was observed. The fragment was gel purified using WIZARD® PCR preps kit and cloned into the pGEM®-T-easy plasmid vector using the EcoRI restriction sites. The ligation reaction was used to transform JM109 competent cells. The plasmid DNA was prepared by miniprep. The inserts were released by EcoRI restriction digestion. The DNA was sequenced in both directions using the M13Forward and M13Reverse sequencing primers.

[0075] We identified further sequences, BE727051 and BE244612, with identity to the 5′ end of DPP9. These were discovered while performing BLASTn with the 5′ end of the DPP9 nucleotide sequence. BE727051 contained further 5′ sequence for DPP9, which was also present in the genomic sequence for DPP9 on chromosome 19p13.3. This was used to design primer DPP9-22F (5′GCCGGCGGGTCCCCTGTGTCCG3′). Primer DPP9-22F was used in conjunction with primer GDD3′end (5′GGGCGGGACAAAGTGCCTCACTGG3′) on cDNA made from the human CEM cell line to produce a 3000 bp product, as expected (FIG. 11).
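
Basic properties of the two primers recited above, such as length, GC content and reverse complement, can be tabulated with a few lines of code. This is an illustrative sketch only; the helper names are hypothetical and the primer sequences are those given in this paragraph, with any whitespace removed.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace, upper-case, and check that only A, C, G, T remain."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned or set(cleaned) - set("ACGT"):
        raise ValueError(f"unexpected characters in sequence: {seq!r}")
    return cleaned


def reverse_complement(seq):
    """Return the reverse complement, written 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    cleaned = clean(seq)
    return (cleaned.count("G") + cleaned.count("C")) / len(cleaned)


PRIMERS = {
    "DPP9-22F": "GCCGGCGGGTCCCCTGTGTCCG",
    "GDD3'end": "GGGCGGGACAAAGTGCCTCACTGG",
}

if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(clean(seq)), round(gc_fraction(seq), 2),
              reverse_complement(seq))
```

Because GDD3′end is the reverse primer, its reverse complement is the sequence expected to appear on the coding strand at the 3′ end of the amplified product.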

[0076] Nucleotide Sequence Analysis of DPP4-like-2a, 2b, and 2c Fragments.

[0077] An analysis of the nucleotide sequence of fragments DPP4-like 2a, 2b and 2c with the Sequencher™ version 3.0 computer program (FIG. 3), and the 5′ fragment isolated by primers DPP9-22F and GDD3′end, revealed the nucleotide sequence shown in FIG. 4.

[0078] The predicted amino acid sequence shown in FIG. 4 was compared to a predicted amino acid sequence encoded by a predicted open reading frame of GDD (predicted from the nucleotide sequence referenced by GenBank Accession Nos. AC005594 and AC005783), to determine the relatedness of the nucleotide sequence of FIG. 4 to the nucleotide sequence of the predicted open reading frame of GDD (FIG. 5). Regions of amino acid identity were observed suggesting that there may be regions of nucleotide sequence identity of the predicted open reading frame of GDD and the sequence of FIG. 4. However, as noted in FIG. 5, there are regions of amino acid sequence encoded by the sequence of FIG. 4 and the amino acid sequence encoded by the predicted open reading frame of GDD which are not identical, demonstrating that the nucleotide sequences encoding the predicted open reading frame of GDD and the sequence shown in FIG. 4 are different nucleotide sequences.

[0079] As described further herein, the predicted amino acid sequence encoded by the cDNA sequence shown in FIG. 4 is homologous to the amino acid sequence of DPP8 (FIG. 6). Accordingly, and as a cDNA consisting of the nucleotide sequence shown in FIG. 4 was not known, the sequence shown in FIG. 4 was named cDNA DPP9.

[0080] The predicted amino acid sequence encoded by cDNA DPP9 (called DPP9) is 969 amino acids and is shown in FIG. 4. The alignment of DPP9 and DPP8 amino acid sequences suggests that the nucleotide sequence shown in FIG. 4 may be a partial length clone. Notwithstanding this point, as discussed below, the inventors have found that the alignment of DPP9 amino acid sequence with the amino acid sequences of DPP8, DPP4 and FAP shows that DPP9 comprises sequence necessary for providing enzymolysis and utility. In view of the similarity between DPP9 and DPP8, a full length clone may be of the order of 882 amino acids. A full length clone could be obtained by standard techniques, including for example, the RACE technique using an oligonucleotide primer derived from the 5′ end of cDNA DPP9.

[0081] In view of the homology between the DPP8 and DPP9 amino acid sequences, it is likely that cDNA DPP9 encodes an amino acid sequence which has dipeptidyl peptidase enzymatic activity. Specifically, it is noted that the DPP9 amino acid sequence contains the catalytic triad Ser-Asp-His in the order of a non-classical serine protease as required for the charge relay system. The serine recognition site characteristic of DPP4 and DPP4-like family members, GWSYGG, surrounds the serine residue, also suggesting that DPP9 cDNA will encode a DPP4-like enzyme activity.

[0082] Further, the DPP9 amino acid sequence also contains the two glutamic acid residues located at positions 205 and 206 in DPPIV. These are believed to be essential for the dipeptidyl peptidase enzymatic activity. By sequence alignment with DPPIV, the residues in DPP8 predicted to play a pivotal role in the pore opening mechanism in Blade 2 of the propeller are E²⁵⁹, E²⁶⁰. These are equivalent to the residues Glu²⁰⁵ and Glu²⁰⁶ in DPPIV which previously have been shown to be essential for DPPIV enzyme activity. A point mutation Glu259Lys was made in DPP8 cDNA using the QuikChange Site-Directed Mutagenesis Kit (Stratagene, La Jolla). COS-7 cells transfected with wildtype DPP8 cDNA stained positive for H-Ala-Pro-4MbNA enzyme activity while the mutant cDNA gave no staining. Expression of DPP8 protein was demonstrated in COS cells transfected with wildtype and mutant cDNAs by immunostaining with anti-V5 mAb. This mAb detects the V5 epitope that has been tagged to the C-terminus of DPP8 protein. Point mutations were made to each of the catalytic residues of DPP8, Ser739Ala, Asp817Ala and His849Ala, and each of these residues was also determined to be essential for DPP8 enzyme activity. In summary, the residues that have been shown experimentally to be required for enzyme activity in DPPIV and DPP8 are present in the DPP9 amino acid sequence: Glu³⁵⁴, Glu³⁵⁵, Ser⁸³⁶, Asp⁹¹⁴ and His⁹⁴⁶.

[0083] The DPP9 amino acid sequence shows the closest relatedness to DPP8, having 77% amino acid similarity and 60% amino acid identity. The relatedness to DPPIV is 25% amino acid identity and 47% amino acid similarity. The % similarity was determined by use of the program/algorithm “GAP” which is available from Genetics Computer Group (GCG), Wisconsin.
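By way of a hedged illustration, percent identity and percent similarity over a pairwise alignment can be computed along the following lines. This Python sketch is not the GAP program: the residue groups used to call two residues "similar" are an assumption chosen for illustration, gapped columns are simply skipped, and the function name `percent_identity_similarity` is hypothetical. The two example strings are short, ungapped excerpts of the human (SEQ ID NO:2, residues 133-160) and mouse (SEQ ID NO:4, residues 33-60) DPP9 sequences from the sequence listing.

```python
# Illustrative residue groups treated as "similar". This is an assumption made
# for the sketch and is not the scoring scheme used by the GAP program.
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def _similar(a, b):
    """True when two residues are identical or share an illustrative group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity_similarity(seq_a, seq_b, gap="-"):
    """Return (% identity, % similarity) over ungapped alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(_similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


# One-letter excerpts copied from the sequence listing (already aligned).
HUMAN = "QKHSWDGLRSIIHGSRKYSGLIVNKAPH"  # SEQ ID NO:2, residues 133-160
MOUSE = "QKHSWDGLRSIIHGSRKSSGLIVSKAPH"  # SEQ ID NO:4, residues 33-60

if __name__ == "__main__":
    identity, similarity = percent_identity_similarity(HUMAN, MOUSE)
    print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Figures obtained this way depend on the alignment and on the similarity definition, so they need not reproduce the values reported from GAP.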

[0084] DPP9 mRNA Expression Studies

[0085] DPP4-like-2a was used to probe a Human Master RNA Blot™ (CLONTECH Laboratories Inc., USA) to study DPP9 tissue expression and the relative levels of DPP9 mRNA expression.

[0086] The DPP4-like-2a fragment hybridised to all tissue mRNA samples on the blot. The hybridisation also indicated high levels of DPP9 expression in most of the tissue samples on the blot (data not shown).

[0087] The DPP4-like-2a fragment was then used to probe two Multiple Tissue Northern Blots™ (CLONTECH Laboratories Inc., USA) to examine the mRNA expression and to determine the size of DPP9 mRNA transcript.

[0088] The autoradiographs of the DPP9 Multiple Tissue Northern blot are shown in FIG. 8. The DPP9 transcript was seen in all tissues examined confirming the results obtained from the Master RNA blot. A single major transcript 4.4 kb in size was seen in all tissues represented on two Blots after 16 hours of exposure. Weak bands could also be seen in some tissues after 6 hours of exposure. The DPP9 transcript was smaller than the 5.1 kb mRNA transcript of DPP8. A minor, very weak transcript 4.8 kb in size was also seen in the spleen, pancreas, peripheral blood leukocytes and heart. The highest mRNA expression was observed in the spleen and heart. Of all tissues examined the thymus had the least DPP9 mRNA expression. The Multiple Tissue Northern Blots were also probed with a β-actin positive control. A 2.0 kb band was seen in all tissues. In addition as expected a 1.8 kb β-actin band was seen in heart and skeletal muscle.

[0089] Rat DPP9 Expression

[0090] A Rat Multiple Tissue Northern Blot (CLONTECH Laboratories, Inc., USA; catalogue #: 7764-1) was hybridised with a human DPP9 radioactively labeled probe, made using the Megaprime DNA Labeling kit and [³²P] dCTP (Amersham International plc, Amersham, UK). The DPP9 PCR product used to make the probe was generated using Met3F (GGCTGAGAGGATGGCCACCACCGGG) as the forward primer and GDD 3′end (GGGCGGGACAAAGTGCCTCACTGG) as the reverse primer. The hybridisation was carried out according to the manufacturers' instructions at 60° C. to detect cross-species hybridisation. After overnight hybridisation the blot was washed at room temperature (2×SSC, 0.1% SDS) then at 40° C. (0.1×SSC, 0.1% SDS).
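The two washes differ in salt concentration and temperature, with the second (0.1×SSC at 40° C.) being the more stringent. As a worked illustration of preparing such wash buffers by simple dilution (C1·V1 = C2·V2), the Python sketch below assumes 20×SSC and 10% SDS stock solutions and a 500 ml final volume. These stocks, the volume and the function name `dilution_volumes` are assumptions for illustration and are not stated in the procedure above.

```python
def dilution_volumes(final_ml, targets, stocks):
    """Volume of each stock (ml) needed for the target concentrations.

    Applies C1*V1 = C2*V2 per component; the remainder is made up with water.
    """
    volumes = {name: final_ml * targets[name] / stocks[name] for name in targets}
    water = final_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("targets cannot be reached from these stocks")
    volumes["water"] = water
    return volumes


# Assumed stock solutions: 20x SSC and 10% (w/v) SDS.
STOCKS = {"SSC_x": 20.0, "SDS_percent": 10.0}

# Room-temperature wash: 2x SSC, 0.1% SDS.
LOW_STRINGENCY = dilution_volumes(500, {"SSC_x": 2.0, "SDS_percent": 0.1}, STOCKS)

# 40 degree C wash: 0.1x SSC, 0.1% SDS.
HIGH_STRINGENCY = dilution_volumes(500, {"SSC_x": 0.1, "SDS_percent": 0.1}, STOCKS)

if __name__ == "__main__":
    print(LOW_STRINGENCY)
    print(HIGH_STRINGENCY)
```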

[0091] The human cDNA probe identified two bands in all tissues examined except in testes. A major transcript of 4 kb in size was seen in all tissues except testes. This 4 kb transcript was strongly expressed in the liver, heart and brain. A second weaker transcript 5.5 kb in size was present in all tissues except skeletal muscle and testes. However, in the brain the 5.5 kb transcript was expressed at a higher level than the 4 kb transcript. In the testes only one transcript approximately 3.5 kb in size was detected. Thus, rat DPP9 mRNA hybridised with a human DPP9 probe, indicating significant homology between DPP9 of the two species. The larger 5.5 kb transcript observed may be due to cross-hybridisation to rat DPP8.

[0092] Mouse DPP9 Expression

[0093] A UniGene cluster for mouse DPP9 was identified (UniGene Cluster Mm.33185) by homology to human DPP9. An analysis of expressed sequence tags contained in this cluster and mouse genomic sequence (AC026385) for Chromosome 17 with the Sequencher™ version 3.0 computer program revealed the nucleotide sequence shown in FIG. 9. This 3517 bp cDNA encodes an 869 aa mouse DPP9 protein (missing N-terminus) with 91% amino acid identity and 94% amino acid similarity to human DPP9. The mouse DPP9 amino acid sequence also has the residues required for enzyme activity: Ser, Asp and His, and the two Glu residues.

[0094] The primers mgdd-pr1F (5′ ACCTGGGAGGAAGCACCCCACTGTG 3′) and mgdd-pr4R (5′ TTCCACCTGGTCCTCAATCTCC 3′) were designed from this sequence and used to amplify a 452 bp product, as expected, from mouse liver cDNA, as described below.
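A reverse primer anneals to the sense strand as its reverse complement. As a hedged illustration, the Python sketch below takes the reverse primer of the pair above (TTCCACCTGGTCCTCAATCTCC), forms its reverse complement, and locates it in a 30-nucleotide excerpt of the mouse DPP9 nucleotide sequence copied from the sequence listing (SEQ ID NO:3, nucleotides 2111-2140). The helper names `reverse_complement` and `gc_fraction` are hypothetical, and the sketch makes no statement about the expected product size.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Reverse primer as given in the text (written 5' to 3').
REVERSE_PRIMER = "TTCCACCTGGTCCTCAATCTCC"

# Sense-strand excerpt of SEQ ID NO:3, nucleotides 2111-2140.
SENSE_EXCERPT = "gccaggtggagattgaggaccaggtggaag"

# 0-based offset of the primer binding site within the excerpt (-1 if absent).
BINDING_OFFSET = SENSE_EXCERPT.upper().find(reverse_complement(REVERSE_PRIMER))

if __name__ == "__main__":
    print(reverse_complement(REVERSE_PRIMER))
    print(BINDING_OFFSET)
    print(round(gc_fraction(REVERSE_PRIMER), 3))
```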

[0095] RNA Preparation

[0096] C57BL/6 mice underwent carbon tetrachloride treatment to induce liver fibrosis. Liver RNA was prepared from snap-frozen tissues using the TRIzol® Reagent and other standard methods.

[0097] cDNA Synthesis

[0098] 2 μg of liver RNA was reverse-transcribed using SuperScript II RNase H⁻ Reverse Transcriptase (Gibco BRL).

[0099] PCR

[0100] PCR using mDPP9-1F (ACCTGGGAGGAAGCACCCCACTGTG) as the forward primer and mDPP9-2R (CTCTCCACATGCAGGGCTACAGAC) as the reverse primer was used to synthesise a 550 base pair mouse DPP9 fragment. The PCR products were generated using AmpliTaq Gold® DNA Polymerase. The PCR was performed as follows: denaturation at 95° C. for 10 min, followed by 35 cycles of denaturation at 95° C. for 30 seconds, primer annealing at 60° C. for 30 seconds, and extension at 72° C. for 1 min.
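For orientation, the cycling conditions above can be written down as a small program and the total hold time added up. The Python sketch below is illustrative only: the extension temperature is taken as 72° C., ramp times between steps and any final extension or hold are not included because none are stated, and the names `Step` and `total_hold_seconds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One thermocycler hold: a label, a temperature and a duration."""
    name: str
    temp_c: float
    seconds: int


# Initial denaturation: 95 degrees C for 10 min.
INITIAL = Step("initial denaturation", 95, 10 * 60)

# One cycle: denaturation, primer annealing, extension.
CYCLE = (
    Step("denaturation", 95, 30),
    Step("annealing", 60, 30),
    Step("extension", 72, 60),
)

N_CYCLES = 35


def total_hold_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


if __name__ == "__main__":
    seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES)
    print(f"{seconds} s ({seconds / 60:.0f} min) of programmed hold time")
```

Under these assumptions the programmed holds total 600 s + 35 × 120 s = 4800 s, i.e. 80 minutes before ramping is taken into account.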

[0101] Southern Blot

[0102] DPP9 PCR products from six mice as well as the largest human DPP9 PCR product were run on a 1% agarose gel. The DNA on the gel was then denatured using 0.4 M NaOH and transferred onto a Hybond-N+ membrane (Amersham International plc, Amersham, UK). The largest human DPP9 PCR product was radiolabeled using the Megaprime DNA Labeling kit and [³²P] dCTP (Amersham International plc, Amersham, UK). Unincorporated label was removed using a NAP column (Pharmacia Biotech, Sweden) and the denatured probe was incubated with the membrane for 2 hours at 60° C. in Express Hybridisation solution (CLONTECH Laboratories, Inc., USA) (FIG. 12). Thus, DPP9 mRNA of appropriate size was detected in fibrotic mouse liver using RT-PCR. Furthermore, the single band of mouse DPP9 cDNA hybridised with a human DPP9 probe, indicating significant homology between DPP9 of the two species.

REFERENCES

[0103] 1. Abbott C A, G W McCaughan & M D Gorrell 1999 Two highly conserved glutamic acid residues in the predicted beta propeller domain of dipeptidyl peptidase IV are required for its enzyme activity FEBS Letters 458: 278-84.

[0104] 2. Abbott C A, D M T Yu, G W McCaughan & M D Gorrell 2000 Post proline peptidases having DP IV like enzyme activity Advances in Experimental Medicine and Biology 477: 103-9.

[0105] 3. McCaughan G W, M D Gorrell, G A Bishop, C A Abbott, N A Shackel, P H McGuinness, M T Levy, A F Sharland, D G Bowen, D Yu, L Slaitini, W B Church & J Napoli 2000 Molecular pathogenesis of liver disease: an approach to hepatic inflammation, cirrhosis and liver transplant tolerance Immunological Reviews 174: 172-91.

[0106] 4. Scanlan M J, B K Raj, B Calvo, P Garin-Chesa, M P Sanz-Moncasi, J H Healey, L J Old & W J Rettig 1994 Molecular cloning of fibroblast activation protein alpha, a member of the serine protease family selectively expressed in stromal fibroblasts of epithelial cancers Proceedings of the National Academy of Sciences of the United States of America 91: 5657-61.

[0107] 5. Handbook of Proteolytic Enzymes. Barrett A J, N D Rawlings & J F Woessner. 1998, London: Academic Press. 1666.

[0108] 6. Jacotot E, C Callebaut, J Blanco, B Krust, K Neubert, A Barth & A G Hovanessian 1996 Dipeptidyl-peptidase IV-beta, a novel form of cell-surface-expressed protein with dipeptidyl-peptidase IV activity European Journal of Biochemistry 239: 248-58.

[0109] 7. Rawlings N D & A J Barrett 1999 MEROPS: the peptidase database Nucleic Acids Research 27: 325-31.

[0110] 8. Park J E, M C Lenter, R N Zimmermann, P Garin-Chesa, L J Old & W J Rettig 1999 Fibroblast activation protein: A dual-specificity serine protease expressed in reactive human tumor stromal fibroblasts Journal of Biological Chemistry 274: 36505-12.

[0111] 9. Levy M T, G W McCaughan, C A Abbott, J E Park, A M Cunningham, E Muller, W J Rettig & M D Gorrell 1999 Fibroblast activation protein: A cell surface dipeptidyl peptidase and gelatinase expressed by stellate cells at the tissue remodelling interface in human cirrhosis Hepatology 29: 1768-78.

[0112] 10. De Meester I, S Korom, J Van Damme & S Scharpé 1999 CD26, let it cut or cut it down Immunology Today 20: 367-75.

[0113] 11. Natural substrates of dipeptidyl peptidase IV. De Meester I, C Durinx, G Bal, P Proost, S Struyf, F Goossens, K Augustyns & S Scharpé. 2000, in Cellular Peptidases in Immune Functions and Diseases II, J Langner & S Ansorge, Editor. Kluwer: New York. p. 67-88.

[0114] 12. Mentlein R 1999 Dipeptidyl-peptidase IV (CD26): role in the inactivation of regulatory peptides Regulatory Peptides 85: 9-24.

[0115] 13. Morrison M E, S Vijayasaradhi, D Engelstein, A P Albino & A N Houghton 1993 A marker for neoplastic progression of human melanocytes is a cell surface ectopeptidase Journal of Experimental Medicine 177: 1135-43.

[0116] 14. Mueller S C, G Ghersi, S K Akiyama, Q X A Sang, L Howard, M Pineiro-Sanchez, H Nakahara, Y Yeh & W T Chen 1999 A novel protease-docking function of integrin at invadopodia Journal of Biological Chemistry 274: 24947-52.

[0117] 15. Holst J J & C F Deacon 1998 Inhibition of the activity of dipeptidyl-peptidase IV as a treatment for type 2 diabetes Diabetes 47: 1663-70.

[0118] 16. Marguet D, L Baggio, T Kobayashi, A M Bernard, M Pierres, P F Nielsen, U Ribel, T Watanabe, D J Drucker & N Wagtmann 2000 Enhanced insulin secretion and improved glucose tolerance in mice lacking CD26 Proceedings of the National Academy of Sciences of the United States of America 97: 6874-9.

[0119] 17. Ohtsuki T, H Tsuda & C Morimoto 2000 Good or evil: CD26 and HIV infection Journal of Dermatological Science 22: 152-60.

[0120] 18. Wesley U V, A P Albino, S Tiwari & A N Houghton 1999 A role for dipeptidyl peptidase IV in suppressing the malignant phenotype of melanocytic cells Journal of Experimental Medicine 190: 311-22.

[0121] 19. Korom S, I De Meester, T H W Stadlbauer, A Chandraker, M Schaub, M H Sayegh, A Belyaev, A Haemers, S Scharpé & J W Kupiec-Weglinski 1997 Inhibition of CD26/dipeptidyl peptidase IV activity in vivo prolongs cardiac allograft survival in rat recipients Transplantation 63: 1495-500.

[0122] 20. Tanaka S, T Murakami, H Horikawa, M Sugiura, K Kawashima & T Sugita 1997 Suppression of arthritis by the inhibitors of dipeptidyl peptidase IV International Journal of Immunopharmacology 19: 15-24.

[0123] 21. Augustyns K, G Bal, G Thonus, A Belyaev, X M Zhang, W Bollaert, A M Lambeir, C Durinx, F Goossens & A Haemers 1999 The unique properties of dipeptidyl-peptidase IV (DPP IV/CD26) and the therapeutic potential of DPP IV inhibitors Current Medicinal Chemistry 6: 311-27.

[0124] 22. Hinke S A, J A Pospisilik, H U Demuth, S Mannhart, K Kuhn-Wache, T Hoffmann, E Nishimura, R A Pederson & C H S McIntosh 2000 Dipeptidyl peptidase IV (DPIV/CD26) degradation of glucagon. Characterization of glucagon degradation products and DPIV-resistant analogs Journal of Biological Chemistry 275: 3827-34.

[0125] 23. Korom S, I De Meester, A Coito, E Graser, H D Volk, K Schwemmle, S Scharpé & J W Kupiec-Weglinski 1999 Immunomodulatory influence of CD26 dipeptidylpeptidase IV during acute and accelerated rejection Langenbecks Archives of Surgery 1: 241-5.

[0126] 24. Tavares W, D J Drucker & P L Brubaker 2000 Enzymatic- and renal-dependent catabolism of the intestinotropic hormone glucagon-like peptide-2 in rats American Journal of Physiology Endocrinology and Metabolism 278: E134-E9.

[0127] 25. David F, A M Bernard, M Pierres & D Marguet 1993 Identification of serine 624, aspartic acid 702, and histidine 734 as the catalytic triad residues of mouse dipeptidyl-peptidase IV (CD26). A member of a novel family of nonclassical serine hydrolases Journal of Biological Chemistry 268: 17247-52.

[0128] 26. Ogata S, Y Misumi, E Tsuji, N Takami, K Oda & Y Ikehara 1992 Identification of the active site residues in dipeptidyl peptidase IV by affinity labeling and site-directed mutagenesis Biochemistry 31: 2582-7.

[0129] 27. Dipeptidyl peptidase IV (DPPIV/CD26): biochemistry and control of cell-surface expression. Trugnan G, T Ait-Slimane, F David, L Baricault, T Berbar, C Lenoir & C Sapin. 1997, in Cell-Surface Peptidases in Health and Disease, A J Kenny & C M Boustead, Editor. BIOS Scientific Publishers: Oxford. p. 203-17.

[0130] 28. Steeg C, U Hartwig & B Fleischer 1995 Unchanged signaling capacity of mutant CD26/dipeptidylpeptidase IV molecules devoid of enzymatic activity Cell Immunol 164: 311-5.

[0131] 29. Fulop V, Z Bocskei & L Polgar 1998 Prolyl oligopeptidase—an unusual beta-propeller domain regulates proteolysis Cell 94: 161-70.

[0132] 30. Ausubel F M, R Brent, R E Kingston, D D Moore, J G Seidman, J A Smith & K Struhl, ed. Current Protocols in Molecular Biology. 1998, John Wiley & Sons: USA.

[0133] 31. Molecular cloning: a laboratory manual. Sambrook J, E F Fritsch & T Maniatis. 1989. 2nd ed., Cold Spring Harbor: Cold Spring Harbor Laboratory Press.

[0134] 32. Augustyns K J L, A M Lambeir, M Borloo, I De Meester, I Vedernikova, G Vanhoof, D Hendriks, S Scharpé & A Haemers 1997 Pyrrolidides: synthesis and structure-activity relationship as inhibitors of dipeptidyl peptidase IV European Journal of Medicinal Chemistry 32: 301-9.

[0135] 33. Stockel-Maschek A, C Mrestani-Klaus, B Stiebitz, H U Demuth & K Neubert 2000 Thioxo amino acid pyrrolidides and thiazolidides: new inhibitors of proline specific peptidases Biochimica et Biophysica Acta—Protein Structure & Molecular Enzymology 1479: 15-31.

[0136] 34. Schön E, I Born, H U Demuth, J Faust, K Neubert, T Steinmetzer, A Barth & S Ansorge 1991 Dipeptidyl peptidase IV in the immune system. Effects of specific enzyme inhibitors on activity of dipeptidyl peptidase IV and proliferation of human lymphocytes Biological Chemistry Hoppe Seyler 372: 305-11.

[0137] 35. Coligan J E, A M Kruisbeek, D H Margulies, E M Shevach & W Strober, eds. Current Protocols in Immunology. 1998, John Wiley & Sons: USA.

[0138] 36. Fibroblast activation protein. Rettig W J. 1998, in Handbook of Proteolytic Enzymes, A J Barrett, N D Rawlings & J F Woessner, Editor. Academic Press: San Diego. p. 387-9.

[0139]

1 8 1 3000 DNA Homo sapiens 1 cggcgggtcc cctgtgtccg ccgcggctgt cgtcccccgc tcccgccact tccggggtcg 60 cagtcccggg catggagccg cgaccgtgag gcgccgctgg acccgggacg acctgcccag 120 tccggccgcc gccccacgtc ccggtctgtg tcccacgcct gcagctggaa tggaggctct 180 ctggaccctt tagaaggcac ccctgccctc ctgaggtcag ctgagcggtt aatgcggaag 240 gttaagaaac tgcgcctgga caaggagaac accggaagtt ggagaagctt ctcgctgaat 300 tccgaggggg ctgagaggat ggccaccacc gggaccccaa cggccgaccg aggcgacgca 360 gccgccacag atgacccggc cgcccgcttc caggtgcaga agcactcgtg ggacgggctc 420 cggagcatca tccacggcag ccgcaagtac tcgggcctca ttgtcaacaa ggcgccccac 480 gacttccagt ttgtgcagaa gacggatgag tctgggcccc actcccaccg cctctactac 540 ctgggaatgc catatggcag ccgggagaac tccctcctct actctgagat tcccaagaag 600 gtccggaaag aggctctgct gctcctgtcc tggaagcaga tgctggatca tttccaggcc 660 acgccccacc atggggtcta ctctcgggag gaggagctgc tgagggagcg gaaacgcctg 720 ggggtcttcg gcatcacctc ctacgacttc cacagcgaga gtggcctctt cctcttccag 780 gccagcaaca gcctcttcca ctgccgcgac ggcggcaaga acggcttcat ggtgtcccct 840 atgaaaccgc tggaaatcaa gacccagtgc tcagggcccc ggatggaccc caaaatctgc 900 cctgccgacc ctgccttctt ctccttcaac aataacagcg acctgtgggt ggccaacatc 960 gagacaggcg aggagcggcg gctgaccttc tgccaccaag gtttatccaa tgtcctggat 1020 gaccccaagt ctgcgggtgt ggccaccttc gtcatacagg aagagttcga ccgcttcact 1080 gggtactggt ggtgccccac agcctcctgg gaaggttcag agggcctcaa gacgctgcga 1140 atcctgtatg aggaagtcga tgagtccgag gtggaggtca ttcacgtccc ctctcctgcg 1200 ctagaagaaa ggaagacgga ctcgtatcgg taccccagga caggcagcaa gaatcccaag 1260 attgccttga aactggctga gttccagact gacagccagg gcaagatcgt ctcgacccag 1320 gagaaggagc tggtgcagcc cttcagctcg ctgttcccga aggtggagta catcgccagg 1380 gccgggtgga cccgggatgg caaatacgcc tgggccatgt tcctggaccg gccccagcag 1440 tggctccagc tcgtcctcct ccccccggcc ctgttcatcc cgagcacaga gaatgaggag 1500 cagcggctag cctctgccag agctgtcccc aggaatgtcc agccgtatgt ggtgtacgag 1560 gaggtcacca acgtctggat caatgttcat gacatcttct atcccttccc ccaatcagag 1620 ggagaggacg agctctgctt tctccgcgcc aatgaatgca agaccggctt ctgccatttg 
1680 tacaaagtca ccgccgtttt aaaatcccag ggctacgatt ggagtgagcc cttcagcccc 1740 ggggaagatg aatttaagtg ccccattaag gaagagattg ctctgaccag cggtgaatgg 1800 gaggttttgg cgaggcacgg ctccaagatc tgggtcaatg aggagaccaa gctggtgtac 1860 ttccagggca ccaaggacac gccgctggag caccacctct acgtggtcag ctatgaggcg 1920 gccggcgaga tcgtacgcct caccacgccc ggcttctccc atagctgctc catgagccag 1980 aacttcgaca tgttcgtcag ccactacagc agcgtgagca cgccgccctg cgtgcacgtc 2040 tacaagctga gcggccccga cgacgacccc ctgcacaagc agccccgctt ctgggctagc 2100 atgatggagg cagccagctg ccccccggat tatgttcctc cagagatctt ccatttccac 2160 acgcgctcgg atgtgcggct ctacggcatg atctacaagc cccacgcctt gcagccaggg 2220 aagaagcacc ccaccgtcct ctttgtatat ggaggccccc aggtgcagct ggtgaataac 2280 tccttcaaag gcatcaagta cttgcggctc aacacactgg cctccctggg ctacgccgtg 2340 gttgtgattg acggcagggg ctcctgtcag cgagggcttc ggttcgaagg ggccctgaaa 2400 aaccaaatgg gccaggtgga gatcgaggac caggtggagg gcctgcagtt cgtggccgag 2460 aagtatggct tcatcgacct gagccgagtt gccatccatg gctggtccta cgggggcttc 2520 ctctcgctca tggggctaat ccacaagccc caggtgttca aggtggccat cgcgggtgcc 2580 ccggtcaccg tctggatggc ctacgacaca gggtacactg agcgctacat ggacgtccct 2640 gagaacaacc agcacggcta tgaggcgggt tccgtggccc tgcacgtgga gaagctgccc 2700 aatgagccca accgcttgct tatcctccac ggcttcctgg acgaaaacgt gcactttttc 2760 cacacaaact tcctcgtctc ccaactgatc cgagcaggga aaccttacca gctccagatc 2820 taccccaacg agagacacag tattcgctgc cccgagtcgg gcgagcacta tgaagtcacg 2880 ttactgcact ttctacagga atacctctga gcctgcccac cgggagccgc cacatcacag 2940 cacaagtggc tgcagcctcc gcggggaacc aggcgggagg gactgagtgg cccgcgggcc 3000 2 969 PRT Homo sapiens 2 Arg Arg Val Pro Cys Val Arg Arg Gly Cys Arg Pro Pro Leu Pro Pro 1 5 10 15 Leu Pro Gly Ser Gln Ser Arg Ala Trp Ser Arg Asp Arg Glu Ala Pro 20 25 30 Leu Asp Pro Gly Arg Pro Ala Gln Ser Gly Arg Arg Pro Thr Ser Arg 35 40 45 Ser Val Ser His Ala Cys Ser Trp Asn Gly Gly Ser Leu Asp Pro Leu 50 55 60 Glu Gly Thr Pro Ala Leu Leu Arg Ser Ala Glu Arg Leu Met Arg Lys 65 70 75 80 Val Lys Lys Leu Arg Leu Asp Lys Glu 
Asn Thr Gly Ser Trp Arg Ser 85 90 95 Phe Ser Leu Asn Ser Glu Gly Ala Glu Arg Met Ala Thr Thr Gly Thr 100 105 110 Pro Thr Ala Asp Arg Gly Asp Ala Ala Ala Thr Asp Asp Pro Ala Ala 115 120 125 Arg Phe Gln Val Gln Lys His Ser Trp Asp Gly Leu Arg Ser Ile Ile 130 135 140 His Gly Ser Arg Lys Tyr Ser Gly Leu Ile Val Asn Lys Ala Pro His 145 150 155 160 Asp Phe Gln Phe Val Gln Lys Thr Asp Glu Ser Gly Pro His Ser His 165 170 175 Arg Leu Tyr Tyr Leu Gly Met Pro Tyr Gly Ser Arg Glu Asn Ser Leu 180 185 190 Leu Tyr Ser Glu Ile Pro Lys Lys Val Arg Lys Glu Ala Leu Leu Leu 195 200 205 Leu Ser Trp Lys Gln Met Leu Asp His Phe Gln Ala Thr Pro His His 210 215 220 Gly Val Tyr Ser Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg Leu 225 230 235 240 Gly Val Phe Gly Ile Thr Ser Tyr Asp Phe His Ser Glu Ser Gly Leu 245 250 255 Phe Leu Phe Gln Ala Ser Asn Ser Leu Phe His Cys Arg Asp Gly Gly 260 265 270 Lys Asn Gly Phe Met Val Ser Pro Met Lys Pro Leu Glu Ile Lys Thr 275 280 285 Gln Cys Ser Gly Pro Arg Met Asp Pro Lys Ile Cys Pro Ala Asp Pro 290 295 300 Ala Phe Phe Ser Phe Asn Asn Asn Ser Asp Leu Trp Val Ala Asn Ile 305 310 315 320 Glu Thr Gly Glu Glu Arg Arg Leu Thr Phe Cys His Gln Gly Leu Ser 325 330 335 Asn Val Leu Asp Asp Pro Lys Ser Ala Gly Val Ala Thr Phe Val Ile 340 345 350 Gln Glu Glu Phe Asp Arg Phe Thr Gly Tyr Trp Trp Cys Pro Thr Ala 355 360 365 Ser Trp Glu Gly Ser Gln Gly Leu Lys Thr Leu Arg Ile Leu Tyr Glu 370 375 380 Glu Val Asp Glu Ser Glu Val Glu Val Ile His Val Pro Ser Pro Ala 385 390 395 400 Leu Glu Glu Arg Lys Thr Asp Ser Tyr Arg Tyr Pro Arg Thr Gly Ser 405 410 415 Lys Asn Pro Lys Ile Ala Leu Lys Leu Ala Glu Phe Gln Thr Asp Ser 420 425 430 Gln Gly Lys Ile Val Ser Thr Gln Glu Lys Glu Leu Val Gln Pro Phe 435 440 445 Ser Ser Leu Phe Pro Lys Val Glu Tyr Ile Ala Arg Ala Gly Trp Thr 450 455 460 Arg Asp Gly Lys Tyr Ala Trp Ala Met Phe Leu Asp Arg Pro Gln Gln 465 470 475 480 Trp Leu Gln Leu Val Leu Leu Pro Pro Ala Leu Phe Ile Pro Ser Thr 485 490 495 Glu Asn Glu Glu Gln Arg Leu Ala Ser Ala 
Arg Ala Val Pro Arg Asn 500 505 510 Val Gln Pro Tyr Val Val Tyr Glu Glu Val Thr Asn Val Trp Ile Asn 515 520 525 Val His Asp Ile Phe Tyr Pro Phe Pro Gln Ser Glu Gly Glu Asp Glu 530 535 540 Leu Cys Phe Leu Arg Ala Asn Glu Cys Lys Thr Gly Phe Cys His Leu 545 550 555 560 Tyr Lys Val Thr Ala Val Leu Lys Ser Gln Gly Tyr Asp Trp Ser Glu 565 570 575 Pro Phe Ser Pro Gly Glu Asp Glu Phe Lys Cys Pro Ile Lys Glu Glu 580 585 590 Ile Ala Leu Thr Ser Gly Glu Trp Glu Val Leu Ala Arg His Gly Ser 595 600 605 Lys Ile Trp Val Asn Glu Glu Thr Lys Leu Val Tyr Phe Gln Gly Thr 610 615 620 Lys Asp Thr Pro Leu Glu His His Leu Tyr Val Val Ser Tyr Glu Ala 625 630 635 640 Ala Gly Glu Ile Val Arg Leu Thr Thr Pro Gly Phe Ser His Ser Cys 645 650 655 Ser Met Ser Gln Asn Phe Asp Met Phe Val Ser His Tyr Ser Ser Val 660 665 670 Ser Thr Pro Pro Cys Val His Val Tyr Lys Leu Ser Gly Pro Asp Asp 675 680 685 Asp Pro Leu His Lys Gln Pro Arg Phe Trp Ala Ser Met Met Glu Ala 690 695 700 Ala Ser Cys Pro Pro Asp Tyr Val Pro Pro Glu Ile Phe His Phe His 705 710 715 720 Thr Arg Ser Asp Val Arg Leu Tyr Gly Met Ile Tyr Lys Pro His Ala 725 730 735 Leu Gln Pro Gly Lys Lys His Pro Thr Val Leu Phe Val Tyr Gly Gly 740 745 750 Pro Gln Val Gln Leu Val Asn Asn Ser Phe Lys Gly Ile Lys Tyr Leu 755 760 765 Arg Leu Asn Thr Leu Ala Ser Leu Gly Tyr Ala Val Val Val Ile Asp 770 775 780 Gly Arg Gly Ser Cys Gln Arg Gly Leu Arg Phe Glu Gly Ala Leu Lys 785 790 795 800 Asn Gln Met Gly Gln Val Glu Ile Glu Asp Gln Val Glu Gly Leu Gln 805 810 815 Phe Val Ala Glu Lys Tyr Gly Phe Ile Asp Leu Ser Arg Val Ala Ile 820 825 830 His Gly Trp Ser Tyr Gly Gly Phe Leu Ser Leu Met Gly Leu Ile His 835 840 845 Lys Pro Gln Val Phe Lys Val Ala Ile Ala Gly Ala Pro Val Thr Val 850 855 860 Trp Met Ala Tyr Asp Thr Gly Tyr Thr Glu Arg Tyr Met Asp Val Pro 865 870 875 880 Glu Asn Asn Gln His Gly Tyr Glu Ala Gly Ser Val Ala Leu His Val 885 890 895 Glu Lys Leu Pro Asn Glu Pro Asn Arg Leu Leu Ile Leu His Gly Phe 900 905 910 Leu Asp Glu Asn Val His Phe Phe His Thr Asn 
Phe Leu Val Ser Gln 915 920 925 Leu Ile Arg Ala Gly Lys Pro Tyr Gln Leu Gln Ile Tyr Pro Asn Glu 930 935 940 Arg His Ser Ile Arg Cys Pro Glu Ser Gly Glu His Tyr Glu Val Thr 945 950 955 960 Leu Leu His Phe Leu Gln Glu Tyr Leu 965 3 3287 DNA Mus musculus 3 ccatcacagg agccccagag gatgtgcagc ggggtctccc cagttgagca ggtggccgca 60 ggggacatgg atgacacggc agcacgcttc tgtgtgcaga agcactcgtg ggatgggctg 120 cgtagcatta tccacggcag tcgcaagtcc tcgggcctca ttgtcagcaa ggccccccac 180 gacttccagt ttgtgcagaa gcctgacgag tctggccccc actctcaccg tctctattac 240 ctcggaatgc cttacggcag ccgtgagaac tccctcctct actccgagat ccccaagaaa 300 gtgcggaagg aggccctgct gctgctgtcc tggaagcaga tgctggacca cttccaggcc 360 acaccccacc atggtgtcta ctcccgagag gaggagctac tgcgggagcg caagcgcctg 420 ggcgtcttcg gaatcacctc ttatgacttc cacagtgaga gcggcctctt cctcttccag 480 gccagcaata gcctgttcca ctgcagggat ggtggcaaga atggctttat ggtgtccccg 540 atgaagccac tggagatcaa gactcagtgt tctgggccac gcatggaccc caaaatctgc 600 cccgcagacc ctgccttctt ttccttcatc aacaacagtg atctgtgggt ggcaaacatc 660 gagactgggg aggaacggcg gctcaccttc tgtcaccagg gttcagctgg tgtcctggac 720 aatcccaaat cagcaggcgt ggccaccttt gtcatccagg aggagttcga ccgcttcact 780 gggtgctggt ggtgccccac ggcctcttgg gaaggctccg aaggtctcaa gacgctgcgc 840 atcctatatg aggaagtgga cgagtctgaa gtggaggtca ttcatgtgcc ctcccccgcc 900 ctggaggaga ggaagacgga ctcctaccgc taccccagga caggcagcaa gaaccccaag 960 attgccctga agctggctga gctccagacg gaccatcagg gcaaaatcgt gtcaagctgc 1020 gagaaggaac tggtacagcc attcagctcc cttttcccca aagtggagta catcgcccgg 1080 gctggctgga cacgggacgg caaatatgcc tgggccatgt tcctggaccg tccccagcaa 1140 cggcttcagc ttgtcctcct gccccctgct ctcttcatcc cggccgttga gagtgaggcc 1200 cagcggcagg cagctgccag agccgtcccc aagaatgtgc agccctttgt catctatgaa 1260 gaagtcacca atgtctggat caacgtccac gacatcttcc acccgtttcc tcaggctgag 1320 ggccagcagg acttttgttt ccttcgtgcc aacgaatgca agactggctt ctgccacctg 1380 tacagggtca cagtggaact taaaaccaag gactatgact ggacggaacc cctcagccct 1440 acagaaggtg agtttaagtg ccccatcaag gaggaggtcg ccctgaccag 
tggcgagtgg 1500 gaggtcttgt cgaggcatgg ctccaagatc tgggtcaacg agcagacgaa gctggtgtac 1560 tttcaaggta caaaggacac accgctggaa catcacctct atgtggtcag ctacgagtca 1620 gcaggcgaga tcgtgcggct caccacgctc ggcttctccc acagctgctc catgagccag 1680 agcttcgaca tgttcgtgag tcactacagc agtgtgagca cgccaccctg tgtacatgtg 1740 tacaagctga gcggccccga tgatgaccca ctgcacaagc aaccacgctt ctgggccagc 1800 atgatggagg cagccaattg ccccccagac tatgtgcccc ctgagatctt ccacttccac 1860 acccgtgcag acgtgcagct ctacggcatg atctacaagc cacacaccct gcaacctggg 1920 aggaagcacc ccactgtgct ctttgtctat gggggcccac aggtgcagtt ggtgaacaac 1980 tcctttaagg gcatcaaata cctgcggcta aatacactgg catccttggg ctatgctgtg 2040 gtggtgatcg atggtcgggg ctcctgtcag cggggcctgc acttcgaggg ggccctgaaa 2100 aatcaaatgg gccaggtgga gattgaggac caggtggaag gcttgcagta cgtggctgag 2160 aagtatggct tcattgactt gagccgagtc gccatccatg gctggtccta cggcggcttc 2220 ctctcactca tggggctcat ccacaagcca caagtgttca aggtagccat tgcgggcgct 2280 cctgtcactg tgtggatggc ctatgacaca gggtacacgg aacgatacat ggatgtcccc 2340 gaaaataacc agcaaggcta tgaggcaggg tctgtagccc tgcatgtgga gaagctgccc 2400 aatgagccta accgcctgct tatcctccac ggcttcctgg acgagaacgt tcacttcttc 2460 cacacaaatt tcctggtgtc ccagctgatc cgagcaggaa agccatacca gcttcagatc 2520 tacccaaacg agagacatag catccgctgc cgcgagtccg gagagcatta cgaggtgacg 2580 ctgctgcact ttctgcagga acacctgtga cctcagtccc gactcctgac gccaccgctg 2640 ctcttcttgc gtttttgtaa tcttttcatt tttgaagctt ccaatttgct tgctgctgct 2700 gctgcctggg ggccaggaca gaggtagtgg cggcccccat gccgccctcc ttgagctggt 2760 gaggagaagt cgccattgag cacacaacct ccaccagact gccatggccc cgaacctgca 2820 attccatcct agcgcagaag catgtgcctg ccacctgctg cccctgcaga gtcatgtgtg 2880 tttgtggtgg gcattttaaa taattattta aaagacagga agtaagcggt accgagcaat 2940 gaaactgaag gtacagcact gggcgtctgg ggaccccacg ctctcccaac gcccagacta 3000 tgtggagctg ccaagcccct gtctgggcac ctctgccctg cctgtctgct gcccggatcc 3060 tcctcactta gcacctaggg gtgtcagggt cgggagtagg acctgtcctg acctcagggt 3120 tatatatagc ccttccccac tccctcctac gagagttctg gcataaagaa gtaaaaaaaa 
3180 aaaaaaaaaa aacaaacaaa aaaaccaaac cacctctaca tattatggaa agaaaatatt 3240 tttgtcaatt cttattcttt tataattatg tggtatgtag actcatt 3287 4 869 PRT Mus musculus 4 Pro Ser Gln Glu Pro Gln Arg Met Cys Ser Gly Val Ser Pro Val Glu 1 5 10 15 Gln Val Ala Ala Gly Asp Met Asp Asp Thr Ala Ala Arg Phe Cys Val 20 25 30 Gln Lys His Ser Trp Asp Gly Leu Arg Ser Ile Ile His Gly Ser Arg 35 40 45 Lys Ser Ser Gly Leu Ile Val Ser Lys Ala Pro His Asp Phe Gln Phe 50 55 60 Val Gln Lys Pro Asp Glu Ser Gly Pro His Ser His Arg Leu Tyr Tyr 65 70 75 80 Leu Gly Met Pro Tyr Gly Ser Arg Glu Asn Ser Leu Leu Tyr Ser Glu 85 90 95 Ile Pro Lys Lys Val Arg Lys Glu Ala Leu Leu Leu Leu Ser Trp Lys 100 105 110 Gln Met Leu Asp His Phe Gln Ala Thr Pro His His Gly Val Tyr Ser 115 120 125 Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg Leu Gly Val Phe Gly 130 135 140 Ile Thr Ser Tyr Asp Phe His Ser Glu Ser Gly Leu Phe Leu Phe Gln 145 150 155 160 Ala Ser Asn Ser Leu Phe His Cys Arg Asp Gly Gly Lys Asn Gly Phe 165 170 175 Met Val Ser Pro Met Lys Pro Leu Glu Ile Lys Thr Gln Cys Ser Gly 180 185 190 Pro Arg Met Asp Pro Lys Ile Cys Pro Ala Asp Pro Ala Phe Phe Ser 195 200 205 Phe Ile Asn Asn Ser Asp Leu Trp Val Ala Asn Ile Glu Thr Gly Glu 210 215 220 Glu Arg Arg Leu Thr Phe Cys His Gln Gly Ser Ala Gly Val Leu Asp 225 230 235 240 Asn Pro Lys Ser Ala Gly Val Ala Thr Phe Val Ile Gln Glu Glu Phe 245 250 255 Asp Arg Phe Thr Gly Cys Trp Trp Cys Pro Thr Ala Ser Trp Glu Gly 260 265 270 Ser Glu Gly Leu Lys Thr Leu Arg Ile Leu Tyr Glu Glu Val Asp Glu 275 280 285 Ser Glu Val Glu Val Ile His Val Pro Ser Pro Ala Leu Glu Glu Arg 290 295 300 Lys Thr Asp Ser Tyr Arg Tyr Pro Arg Thr Gly Ser Lys Asn Pro Lys 305 310 315 320 Ile Ala Leu Lys Leu Ala Glu Leu Gln Thr Asp His Gln Gly Lys Ile 325 330 335 Val Ser Ser Cys Glu Lys Glu Leu Val Gln Pro Phe Ser Ser Leu Phe 340 345 350 Pro Lys Val Glu Tyr Ile Ala Arg Ala Gly Trp Thr Arg Asp Gly Lys 355 360 365 Tyr Ala Trp Ala Met Phe Leu Asp Arg Pro Gln Gln Arg Leu Gln Leu 370 375 380 Val Leu Leu Pro Pro 
Ala Leu Phe Ile Pro Ala Val Glu Ser Glu Ala 385 390 395 400 Gln Arg Gln Ala Ala Ala Arg Ala Val Pro Lys Asn Val Gln Pro Phe 405 410 415 Val Ile Tyr Glu Glu Val Thr Asn Val Trp Ile Asn Val His Asp Ile 420 425 430 Phe His Pro Phe Pro Gln Ala Glu Gly Gln Gln Asp Phe Cys Phe Leu 435 440 445 Arg Ala Asn Glu Cys Lys Thr Gly Phe Cys His Leu Tyr Arg Val Thr 450 455 460 Val Glu Leu Lys Thr Lys Asp Tyr Asp Trp Thr Glu Pro Leu Ser Pro 465 470 475 480 Thr Glu Gly Glu Phe Lys Cys Pro Ile Lys Glu Glu Val Ala Leu Thr 485 490 495 Ser Gly Glu Trp Glu Val Leu Ser Arg His Gly Ser Lys Ile Trp Val 500 505 510 Asn Glu Gln Thr Lys Leu Val Tyr Phe Gln Gly Thr Lys Asp Thr Pro 515 520 525 Leu Glu His His Leu Tyr Val Val Ser Tyr Glu Ser Ala Gly Glu Ile 530 535 540 Val Arg Leu Thr Thr Leu Gly Phe Ser His Ser Cys Ser Met Ser Gln 545 550 555 560 Ser Phe Asp Met Phe Val Ser His Tyr Ser Ser Val Ser Thr Pro Pro 565 570 575 Cys Val His Val Tyr Lys Leu Ser Gly Pro Asp Asp Asp Pro Leu His 580 585 590 Lys Gln Pro Arg Phe Trp Ala Ser Met Met Glu Ala Ala Asn Cys Pro 595 600 605 Pro Asp Tyr Val Pro Pro Glu Ile Phe His Phe His Thr Arg Ala Asp 610 615 620 Val Gln Leu Tyr Gly Met Ile Tyr Lys Pro His Thr Leu Gln Pro Gly 625 630 635 640 Arg Lys His Pro Thr Val Leu Phe Val Tyr Gly Gly Pro Gln Val Gln 645 650 655 Leu Val Asn Asn Ser Phe Lys Gly Ile Lys Tyr Leu Arg Leu Asn Thr 660 665 670 Leu Ala Ser Leu Gly Tyr Ala Val Val Val Ile Asp Gly Arg Gly Ser 675 680 685 Cys Gln Arg Gly Leu His Phe Glu Gly Ala Leu Lys Asn Gln Met Gly 690 695 700 Gln Val Glu Ile Glu Asp Gln Val Glu Gly Leu Gln Tyr Val Ala Glu 705 710 715 720 Lys Tyr Gly Phe Ile Asp Leu Ser Arg Val Ala Ile His Gly Trp Ser 725 730 735 Tyr Gly Gly Phe Leu Ser Leu Met Gly Leu Ile His Lys Pro Gln Val 740 745 750 Phe Lys Val Ala Ile Ala Gly Ala Pro Val Thr Val Trp Met Ala Tyr 755 760 765 Asp Thr Gly Tyr Thr Glu Arg Tyr Met Asp Val Pro Glu Asn Asn Gln 770 775 780 Gln Gly Tyr Glu Ala Gly Ser Val Ala Leu His Val Glu Lys Leu Pro 785 790 795 800 Asn Glu Pro Asn Arg 
Leu Leu Ile Leu His Gly Phe Leu Asp Glu Asn 805 810 815 Val His Phe Phe His Thr Asn Phe Leu Val Ser Gln Leu Ile Arg Ala 820 825 830 Gly Lys Pro Tyr Gln Leu Gln Ile Tyr Pro Asn Glu Arg His Ser Ile 835 840 845 Arg Cys Arg Glu Ser Gly Glu His Tyr Glu Val Thr Leu Leu His Phe 850 855 860 Leu Gln Glu His Leu 865 5 3120 DNA Homo sapiens 5 aagtgctaaa gcctccgagg ccaaggccgc tgctactgcc gccgctgctt cttagtgccg 60 cgttcgccgc ctgggttgtc accggcgccg ccgccgagga agccactgca accaggaccg 120 gagtggaggc ggcgcagcat gaagcggcgc aggcccgctc catagcgcac gtcgggacgg 180 tccgggcggg gccgggggga aggaaaatgc aacatggcag cagcaatgga aacagaacag 240 ctgggtgttg agatatttga aactgcggac tgtgaggaga atattgaatc acaggatcgg 300 cctaaattgg agccttttta tgttgagcgg tattcctgga gtcagcttaa aaagctgctt 360 gccgatacca gaaaatatca tggctacatg atggctaagg caccacatga tttcatgttt 420 gtgaagagga atgatccaga tggacctcat tcagacagaa tctattacct tgccatgtct 480 ggtgagaaca gagaaaatac actgttttat tctgaaattc ccaaaactat caatagagca 540 gcagtcttaa tgctctcttg gaagcctctt ttggatcttt ttcaggcaac actggactat 600 ggaatgtatt ctcgagaaga agaactatta agagaaagaa aacgcattgg aacagtcgga 660 attgcttctt acgattatca ccaaggaagt ggaacatttc tgtttcaagc cggtagtgga 720 atttatcacg taaaagatgg agggccacaa ggatttacgc aacaaccttt aaggcccaat 780 ctagtggaaa ctagttgtcc caacatacgg atggatccaa aattatgccc cgctgatcca 840 gactggattg cttttataca tagcaacgat atttggatat ctaacatcgt aaccagagaa 900 gaaaggagac tcacttatgt gcacaatgag ctagccaaca tggaagaaga tgccagatca 960 gctggagtcg ctacctttgt tctccaagaa gaatttgata gatattctgg ctattggtgg 1020 tgtccaaaag ctgaaacaac tcccagtggt ggtaaaattc ttagaattct atatgaagaa 1080 aatgatgaat ctgaggtgga aattattcat gttacatccc ctatgttgga aacaaggagg 1140 gcagattcat tccgttatcc taaaacaggt acagcaaatc ctaaagtcac ttttaagatg 1200 tcagaaataa tgattgatgc tgaaggaagg atcatagatg tcatagataa ggaactaatt 1260 caaccttttg agattctatt tgaaggagtt gaatatattg ccagagctgg atggactcct 1320 gagggaaaat atgcttggtc catcctacta gatcgctccc agactcgcct acagatagtg 1380 ttgatctcac ctgaattatt tatcccagta gaagatgatg 
ttatggaaag gcagagactc 1440 attgagtcag tgcctgattc tgtgacgcca ctaattatct atgaagaaac aacagacatc 1500 tggataaata tccatgacat ctttcatgtt tttccccaaa gtcacgaaga ggaaattgag 1560 tttatttttg cctctgaatg caaaacaggt ttccgtcatt tatacaaaat tacatctatt 1620 ttaaaggaaa gcaaatataa acgatccagt ggtgggctgc ctgctccaag tgatttcaag 1680 tgtcctatca aagaggagat agcaattacc agtggtgaat gggaagttct tggccggcat 1740 ggatctaata tccaagttga tgaagtcaga aggctggtat attttgaagg caccaaagac 1800 tcccctttag agcatcacct gtacgtagtc agttacgtaa atcctggaga ggtgacaagg 1860 ctgactgacc gtggctactc acattcttgc tgcatcagtc agcactgtga cttctttata 1920 agtaagtata gtaaccagaa gaatccacac tgtgtgtccc tttacaagct atcaagtcct 1980 gaagatgacc caacttgcaa aacaaaggaa ttttgggcca ccattttgga ttcagcaggt 2040 cctcttcctg actatactcc tccagaaatt ttctcttttg aaagtactac tggatttaca 2100 ttgtatggga tgctctacaa gcctcatgat ctacagcctg gaaagaaata tcctactgtg 2160 ctgttcatat atggtggtcc tcaggtgcag ttggtgaata atcggtttaa aggagtcaag 2220 tatttccgct tgaataccct agcctctcta ggttatgtgg ttgtagtgat agacaacagg 2280 ggatcctgtc accgagggct taaatttgaa ggcgccttta aatataaaat gggtcaaata 2340 gaaattgacg atcaggtgga aggactccaa tatctagctt ctcgatatga tttcattgac 2400 ttagatcgtg tgggcatcca cggctggtcc tatggaggat acctctccct gatggcatta 2460 atgcagaggt cagatatctt cagggttgct attgctgggg ccccagtcac tctgtggatc 2520 ttctatgata caggatacac ggaacgttat atgggtcacc ctgaccagaa tgaacagggc 2580 tattacttag gatctgtggc catgcaagca gaaaagttcc cctctgaacc aaatcgttta 2640 ctgctcttac atggtttcct ggatgagaat gtccattttg cacataccag tatattactg 2700 agttttttag tgagggctgg aaagccatat gatttacaga tctatcctca ggagagacac 2760 agcataagag ttcctgaatc gggagaacat tatgaactgc atcttttgca ctaccttcaa 2820 gaaaaccttg gatcacgtat tgctgctcta aaagtgatat aattttgacc tgtgtagaac 2880 tctctggtat acactggcta tttaaccaaa tgaggaggtt taatcaacag aaaacacaga 2940 attgatcatc acattttgat acctgccatg taacatctac tcctgaaaat aaatgtggtg 3000 ccatgcaggg gtctacggtt tgtggtagta atctaatacc ttaaccccac atgctcaaaa 3060 tcaaatgata catattcctg agagacccag caataccata agaattacta 
aaaaaaaaaa 3120 6 882 PRT Homo sapiens 6 Met Ala Ala Ala Met Glu Thr Glu Gln Leu Gly Val Glu Ile Phe Glu 1 5 10 15 Thr Ala Asp Cys Glu Glu Asn Ile Glu Ser Gln Asp Arg Pro Lys Leu 20 25 30 Glu Pro Phe Tyr Val Glu Arg Tyr Ser Trp Ser Gln Leu Lys Lys Leu 35 40 45 Leu Ala Asp Thr Arg Lys Tyr His Gly Tyr Met Met Ala Lys Ala Pro 50 55 60 His Asp Phe Met Phe Val Lys Arg Asn Asp Pro Asp Gly Pro His Ser 65 70 75 80 Asp Arg Ile Tyr Tyr Leu Ala Met Ser Gly Glu Asn Arg Glu Asn Thr 85 90 95 Leu Phe Tyr Ser Glu Ile Pro Lys Thr Ile Asn Arg Ala Ala Val Leu 100 105 110 Met Leu Ser Trp Lys Pro Leu Leu Asp Leu Phe Gln Ala Thr Leu Asp 115 120 125 Tyr Gly Met Tyr Ser Arg Glu Glu Glu Leu Leu Arg Glu Arg Lys Arg 130 135 140 Ile Gly Thr Val Gly Ile Ala Ser Tyr Asp Tyr His Gln Gly Ser Gly 145 150 155 160 Thr Phe Leu Phe Gln Ala Gly Ser Gly Ile Tyr His Val Lys Asp Gly 165 170 175 Gly Pro Gln Gly Phe Thr Gln Gln Pro Leu Arg Pro Asn Leu Val Glu 180 185 190 Thr Ser Cys Pro Asn Ile Arg Met Asp Pro Lys Leu Cys Pro Ala Asp 195 200 205 Pro Asp Trp Ile Ala Phe Ile His Ser Asn Asp Ile Trp Ile Ser Asn 210 215 220 Ile Val Thr Arg Glu Glu Arg Arg Leu Thr Tyr Val His Asn Glu Leu 225 230 235 240 Ala Asn Met Glu Glu Asp Ala Arg Ser Ala Gly Val Ala Thr Phe Val 245 250 255 Leu Gln Glu Glu Phe Asp Arg Tyr Ser Gly Tyr Trp Trp Cys Pro Lys 260 265 270 Ala Glu Thr Thr Pro Ser Gly Gly Lys Ile Leu Arg Ile Leu Tyr Glu 275 280 285 Glu Asn Asp Glu Ser Glu Val Glu Ile Ile His Val Thr Ser Pro Met 290 295 300 Leu Glu Thr Arg Arg Ala Asp Ser Phe Arg Tyr Pro Lys Thr Gly Thr 305 310 315 320 Ala Asn Pro Lys Val Thr Phe Lys Met Ser Glu Ile Met Ile Asp Ala 325 330 335 Glu Gly Arg Ile Ile Asp Val Ile Asp Lys Glu Leu Ile Gln Pro Phe 340 345 350 Glu Ile Leu Phe Glu Gly Val Glu Tyr Ile Ala Arg Ala Gly Trp Thr 355 360 365 Pro Glu Gly Lys Tyr Ala Trp Ser Ile Leu Leu Asp Arg Ser Gln Thr 370 375 380 Arg Leu Gln Ile Val Leu Ile Ser Pro Glu Leu Phe Ile Pro Val Glu 385 390 395 400 Asp Asp Val Met Glu Arg Gln Arg Leu Ile Glu Ser Val Pro 
Asp Ser 405 410 415 Val Thr Pro Leu Ile Ile Tyr Glu Glu Thr Thr Asp Ile Trp Ile Asn 420 425 430 Ile His Asp Ile Phe His Val Phe Pro Gln Ser His Glu Glu Glu Ile 435 440 445 Glu Phe Ile Phe Ala Ser Glu Cys Lys Thr Gly Phe Arg His Leu Tyr 450 455 460 Lys Ile Thr Ser Ile Leu Lys Glu Ser Lys Tyr Lys Arg Ser Ser Gly 465 470 475 480 Gly Leu Pro Ala Pro Ser Asp Phe Lys Cys Pro Ile Lys Glu Glu Ile 485 490 495 Ala Ile Thr Ser Gly Glu Trp Glu Val Leu Gly Arg His Gly Ser Asn 500 505 510 Ile Gln Val Asp Glu Val Arg Arg Leu Val Tyr Phe Glu Gly Thr Lys 515 520 525 Asp Ser Pro Leu Glu His His Leu Tyr Val Val Ser Tyr Val Asn Pro 530 535 540 Gly Glu Val Thr Arg Leu Thr Asp Arg Gly Tyr Ser His Ser Cys Cys 545 550 555 560 Ile Ser Gln His Cys Asp Phe Phe Ile Ser Lys Tyr Ser Asn Gln Lys 565 570 575 Asn Pro His Cys Val Ser Leu Tyr Lys Leu Ser Ser Pro Glu Asp Asp 580 585 590 Pro Thr Cys Lys Thr Lys Glu Phe Trp Ala Thr Ile Leu Asp Ser Ala 595 600 605 Gly Pro Leu Pro Asp Tyr Thr Pro Pro Glu Ile Phe Ser Phe Glu Ser 610 615 620 Thr Thr Gly Phe Thr Leu Tyr Gly Met Leu Tyr Lys Pro His Asp Leu 625 630 635 640 Gln Pro Gly Lys Lys Tyr Pro Thr Val Leu Phe Ile Tyr Gly Gly Pro 645 650 655 Gln Val Gln Leu Val Asn Asn Arg Phe Lys Gly Val Lys Tyr Phe Arg 660 665 670 Leu Asn Thr Leu Ala Ser Leu Gly Tyr Val Val Val Val Ile Asp Asn 675 680 685 Arg Gly Ser Cys His Arg Gly Leu Lys Phe Glu Gly Ala Phe Lys Tyr 690 695 700 Lys Met Gly Gln Ile Glu Ile Asp Asp Gln Val Glu Gly Leu Gln Tyr 705 710 715 720 Leu Ala Ser Arg Tyr Asp Phe Ile Asp Leu Asp Arg Val Gly Ile His 725 730 735 Gly Trp Ser Tyr Gly Gly Tyr Leu Ser Leu Met Ala Leu Met Gln Arg 740 745 750 Ser Asp Ile Phe Arg Val Ala Ile Ala Gly Ala Pro Val Thr Leu Trp 755 760 765 Ile Phe Tyr Asp Thr Gly Tyr Thr Glu Arg Tyr Met Gly His Pro Asp 770 775 780 Gln Asn Glu Gln Gly Tyr Tyr Leu Gly Ser Val Ala Met Gln Ala Glu 785 790 795 800 Lys Phe Pro Ser Glu Pro Asn Arg Leu Leu Leu Leu His Gly Phe Leu 805 810 815 Asp Glu Asn Val His Phe Ala His Thr Ser Ile Leu Leu Ser Phe 
Leu 820 825 830 Val Arg Ala Gly Lys Pro Tyr Asp Leu Gln Ile Tyr Pro Gln Glu Arg 835 840 845 His Ser Ile Arg Val Pro Glu Ser Gly Glu His Tyr Glu Leu His Leu 850 855 860 Leu His Tyr Leu Gln Glu Asn Leu Gly Ser Arg Ile Ala Ala Leu Lys 865 870 875 880 Val Ile 7 830 PRT Homo sapiens 7 Leu Arg Ser Ile Ile His Gly Ser Arg Lys Tyr Ser Gly Leu Ile Val 1 5 10 15 Asn Lys Ala Pro His Asp Phe Gln Phe Val Gln Lys Thr Asp Glu Ser 20 25 30 Gly Pro His Ser His Arg Leu Tyr Tyr Leu Gly Met Pro Tyr Gly Ser 35 40 45 Arg Glu Asn Ser Leu Leu Tyr Ser Glu Ile Pro Lys Lys Val Arg Lys 50 55 60 Glu Ala Leu Leu Leu Leu Ser Trp Lys Gln Met Leu Asp His Phe Gln 65 70 75 80 Ala Thr Pro His His Gly Val Tyr Ser Arg Glu Glu Glu Leu Leu Arg 85 90 95 Glu Arg Lys Arg Leu Gly Val Phe Gly Ile Thr Ser Tyr Asp Phe His 100 105 110 Ser Glu Ser Gly Leu Phe Leu Phe Gln Ala Ser Asn Ser Leu Phe His 115 120 125 Cys Arg Asp Gly Gly Lys Asn Gly Phe Met Val Ser Pro Met Lys Pro 130 135 140 Leu Glu Ile Lys Thr Gln Cys Ser Gly Pro Arg Met Asp Pro Lys Ile 145 150 155 160 Cys Pro Ala Asp Pro Ala Phe Phe Ser Phe Asn Asn Asn Ser Asp Leu 165 170 175 Trp Val Ala Asn Ile Glu Thr Gly Glu Glu Arg Arg Leu Thr Phe Cys 180 185 190 His Gln Gly Leu Ser Asn Val Leu Asp Asp Pro Lys Ser Ala Gly Val 195 200 205 Ala Thr Phe Val Ile Gln Glu Glu Phe Asp Arg Phe Thr Gly Tyr Trp 210 215 220 Trp Cys Pro Thr Ala Ser Trp Glu Gly Ser Gln Gly Leu Lys Thr Leu 225 230 235 240 Arg Ile Leu Tyr Glu Glu Val Asp Glu Ser Glu Val Glu Val Ile His 245 250 255 Val Pro Ser Pro Ala Leu Glu Glu Arg Lys Thr Asp Ser Tyr Arg Tyr 260 265 270 Pro Arg Thr Gly Ser Lys Asn Pro Lys Ile Ala Leu Lys Leu Ala Glu 275 280 285 Phe Gln Thr Asp Ser Gln Gly Lys Ile Val Ser Thr Gln Glu Lys Glu 290 295 300 Leu Val Gln Pro Phe Ser Ser Leu Phe Pro Lys Val Glu Tyr Ile Ala 305 310 315 320 Arg Ala Gly Trp Thr Arg Asp Gly Lys Tyr Ala Trp Ala Met Phe Leu 325 330 335 Asp Arg Pro Gln Gln Trp Leu Gln Leu Val Leu Leu Pro Pro Ala Leu 340 345 350 Phe Ile Pro Ser Thr Glu Asn Glu Glu Gln Arg Leu 
Ala Ser Ala Arg 355 360 365 Ala Val Pro Arg Asn Val Gln Pro Tyr Val Val Tyr Glu Glu Val Thr 370 375 380 Asn Val Trp Ile Asn Val His Asp Ile Phe Tyr Pro Phe Pro Gln Ser 385 390 395 400 Glu Gly Glu Asp Glu Leu Cys Phe Leu Arg Ala Asn Glu Cys Lys Thr 405 410 415 Gly Phe Cys His Leu Tyr Lys Val Thr Ala Val Leu Lys Ser Gln Gly 420 425 430 Tyr Asp Trp Ser Glu Pro Phe Ser Pro Gly Glu Asp Glu Phe Lys Cys 435 440 445 Pro Ile Lys Glu Glu Ile Ala Leu Thr Ser Gly Glu Trp Glu Val Leu 450 455 460 Ala Arg His Gly Ser Lys Ile Trp Val Asn Glu Glu Thr Lys Leu Val 465 470 475 480 Tyr Phe Gln Gly Thr Lys Asp Thr Pro Leu Glu His His Leu Tyr Val 485 490 495 Val Ser Tyr Glu Ala Ala Gly Glu Ile Val Arg Leu Thr Thr Pro Gly 500 505 510 Phe Ser His Ser Cys Ser Met Ser Gln Asn Phe Asp Met Phe Val Ser 515 520 525 His Tyr Ser Ser Val Ser Thr Pro Pro Cys Val His Val Tyr Lys Leu 530 535 540 Ser Gly Pro Asp Asp Asp Pro Leu His Lys Gln Pro Arg Phe Trp Ala 545 550 555 560 Ser Met Met Glu Ala Ala Ser Cys Pro Pro Asp Tyr Val Pro Pro Glu 565 570 575 Ile Phe His Phe His Thr Arg Ser Asp Val Arg Leu Tyr Gly Met Ile 580 585 590 Tyr Lys Pro His Ala Leu Gln Pro Gly Lys Lys His Pro Thr Val Leu 595 600 605 Phe Val Tyr Gly Gly Pro Gln Val Gln Leu Val Asn Asn Ser Phe Lys 610 615 620 Gly Ile Lys Tyr Leu Arg Leu Asn Thr Leu Ala Ser Leu Gly Tyr Ala 625 630 635 640 Val Val Val Ile Asp Gly Arg Gly Ser Cys Gln Arg Gly Leu Arg Phe 645 650 655 Glu Gly Ala Leu Lys Asn Gln Met Gly Gln Val Glu Ile Glu Asp Gln 660 665 670 Val Glu Gly Leu Gln Phe Val Ala Glu Lys Tyr Gly Phe Ile Asp Leu 675 680 685 Ser Arg Val Ala Ile His Gly Trp Ser Tyr Gly Gly Phe Leu Ser Leu 690 695 700 Met Gly Leu Ile His Lys Pro Gln Val Phe Lys Val Ala Ile Ala Gly 705 710 715 720 Ala Pro Val Thr Val Trp Met Ala Tyr Asp Thr Gly Tyr Thr Glu Arg 725 730 735 Tyr Met Asp Val Pro Glu Asn Asn Gln His Gly Tyr Glu Ala Gly Ser 740 745 750 Val Ala Leu His Val Glu Lys Leu Pro Asn Glu Pro Asn Arg Leu Leu 755 760 765 Ile Leu His Gly Phe Leu Asp Glu Asn Val His Phe Phe 
His Thr Asn 770 775 780 Phe Leu Val Ser Gln Leu Ile Arg Ala Gly Lys Pro Tyr Gln Leu Gln 785 790 795 800 Ile Tyr Pro Asn Glu Arg His Ser Ile Arg Cys Pro Glu Ser Gly Glu 805 810 815 His Tyr Glu Val Thr Leu Leu His Phe Leu Gln Glu Tyr Leu 820 825 830 8 2495 DNA Homo sapiens 8 ctccggagca tcatccacgg cagccgcaag tactcgggcc tcattgtcaa caaggcgccc 60 cacgacttcc agtttgtgca gaagacggat gagtctgggc cccactccca ccgcctctac 120 tacctgggaa tgccatatgg cagccgggag aactccctcc tctactctga gattcccaag 180 aaggtccgga aagaggctct gctgctcctg tcctggaagc agatgctgga tcatttccag 240 gccacgcccc accatggggt ctactctcgg gaggaggagc tgctgaggga gcggaaacgc 300 ctgggggtct tcggcatcac ctcctacgac ttccacagcg agagtggcct cttcctcttc 360 caggccagca acagcctctt ccactgccgc gacggcggca agaacggctt catggtgtcc 420 cctatgaaac cgctggaaat caagacccag tgctcagggc cccggatgga ccccaaaatc 480 tgccctgccg accctgcctt cttctccttc aacaataaca gcgacctgtg ggtggccaac 540 atcgagacag gcgaggagcg gcggctgacc ttctgccacc aaggtttatc caatgtcctg 600 gatgacccca agtctgcggg tgtggccacc ttcgtcatac aggaagagtt cgaccgcttc 660 actgggtact ggtggtgccc cacagcctcc tgggaaggtt cagagggcct caagacgctg 720 cgaatcctgt atgaggaagt cgatgagtcc gaggtggagg tcattcacgt cccctctcct 780 gcgctagaag aaaggaagac ggactcgtat cggtacccca ggacaggcag caagaatccc 840 aagattgcct tgaaactggc tgagttccag actgacagcc agggcaagat cgtctcgacc 900 caggagaagg agctggtgca gcccttcagc tcgctgttcc cgaaggtgga gtacatcgcc 960 agggccgggt ggacccggga tggcaaatac gcctgggcca tgttcctgga ccggccccag 1020 cagtggctcc agctcgtcct cctccccccg gccctgttca tcccgagcac agagaatgag 1080 gagcagcggc tagcctctgc cagagctgtc cccaggaatg tccagccgta tgtggtgtac 1140 gaggaggtca ccaacgtctg gatcaatgtt catgacatct tctatccctt cccccaatca 1200 gagggagagg acgagctctg ctttctccgc gccaatgaat gcaagaccgg cttctgccat 1260 ttgtacaaag tcaccgccgt tttaaaatcc cagggctacg attggagtga gcccttcagc 1320 cccggggaag atgaatttaa gtgccccatt aaggaagaga ttgctctgac cagcggtgaa 1380 tgggaggttt tggcgaggca cggctccaag atctgggtca atgaggagac caagctggtg 1440 tacttccagg gcaccaagga cacgccgctg 
gagcaccacc tctacgtggt cagctatgag 1500 gcggccggcg agatcgtacg cctcaccacg cccggcttct cccatagctg ctccatgagc 1560 cagaacttcg acatgttcgt cagccactac agcagcgtga gcacgccgcc ctgcgtgcac 1620 gtctacaagc tgagcggccc cgacgacgac cccctgcaca agcagccccg cttctgggct 1680 agcatgatgg aggcagccag ctgccccccg gattatgttc ctccagagat cttccatttc 1740 cacacgcgct cggatgtgcg gctctacggc atgatctaca agccccacgc cttgcagcca 1800 gggaagaagc accccaccgt cctctttgta tatggaggcc cccaggtgca gctggtgaat 1860 aactccttca aaggcatcaa gtacttgcgg ctcaacacac tggcctccct gggctacgcc 1920 gtggttgtga ttgacggcag gggctcctgt cagcgagggc ttcggttcga aggggccctg 1980 aaaaaccaaa tgggccaggt ggagatcgag gaccaggtgg agggcctgca gttcgtggcc 2040 gagaagtatg gcttcatcga cctgagccga gttgccatcc atggctggtc ctacgggggc 2100 ttcctctcgc tcatggggct aatccacaag ccccaggtgt tcaaggtggc catcgcgggt 2160 gccccggtca ccgtctggat ggcctacgac acagggtaca ctgagcgcta catggacgtc 2220 cctgagaaca accagcacgg ctatgaggcg ggttccgtgg ccctgcacgt ggagaagctg 2280 cccaatgagc ccaaccgctt gcttatcctc cacggcttcc tggacgaaaa cgtgcacttt 2340 ttccacacaa acttcctcgt ctcccaactg atccgagcag ggaaacctta ccagctccag 2400 atctacccca acgagagaca cagtattcgc tgccccgagt cgggcgagca ctatgaagtc 2460 acgttactgc actttctaca ggaatacctc tgagc 2495 

1. A peptide which comprises: (a) the sequence shown in SEQ ID NO:2; or (b) the amino acid sequences: His⁸³³GlyTrpSerTyrGlyGlyPheLeu; Leu⁹¹³AspGluAsnValHisPhePhe; Glu⁹⁴⁴ArgHisSerIleArg and Phe³⁵⁰ValIleGlnGluGluPhe, and which has the substrate specificity of the sequence shown in SEQ ID NO:2; or (c) the sequence which has at least 60% identity with the sequence shown in SEQ ID NO:2, and which has the substrate specificity of the sequence shown in SEQ ID NO:2; or (d) the sequence shown in SEQ ID NO:4.
 2. A peptide according to claim 1 (c), wherein the amino acid identity is at least 75%.
 3. A peptide according to claim 1 (c) wherein the amino acid identity is at least 95%.
 4. A fragment of the sequence shown in SEQ ID NO:2 which has the substrate specificity of the sequence shown in SEQ ID NO:2.
 5. A fragment according to claim 4 which comprises part of the sequence shown in SEQ ID NO:2.
 6. A fusion protein comprising the amino acid sequence shown in SEQ ID NO:2 linked with a further amino acid sequence, the fusion protein having the substrate specificity of the sequence shown in SEQ ID NO:2.
 7. A fusion protein according to claim 6 wherein the further amino acid sequence is selected from the group consisting of GST, V5 epitope and His tag.
 8. A method of identifying a molecule capable of inhibiting cleavage of a substrate by DPP9 comprising the following steps: (a) contacting DPP9 with the molecule; (b) contacting DPP9 of step (a) with a substrate capable of being cleaved by DPP9, in conditions sufficient for cleavage of the substrate by DPP9; and (c) detecting substrate not cleaved by DPP9, to identify that the molecule is capable of inhibiting cleavage of the substrate by DPP9.
 9. A method of identifying a molecule capable of specifically inhibiting the cleavage of a substrate by DPP9, the method comprising the following steps: (a) contacting DPP9 and a further protease with the molecule; (b) contacting DPP9 and the further protease of step (a) with a substrate capable of being cleaved by DPP9 and the further protease, in conditions sufficient for cleavage of the substrate by DPP9 and the further protease; and (c) detecting substrate not cleaved by DPP9, but cleaved by the further protease, to identify that the molecule is capable of specifically inhibiting the cleavage of the substrate by DPP9.
 10. A method of reducing or inhibiting the catalytic activity of DPP9, the method comprising the step of contacting DPP9 with an inhibitor of DPP9 catalytic activity.
 11. A method of cleaving a substrate comprising the step of contacting the substrate with DPP9 in conditions sufficient for cleavage of the substrate by DPP9.
 12. A nucleic acid molecule which: (a) encodes the sequence shown in SEQ ID NO:2; or (b) consists of the sequence shown in SEQ ID NO:1; or (c) is capable of hybridizing to a nucleic acid molecule consisting of the sequence shown in SEQ ID NO:1 in stringent conditions, and which encodes a peptide which has the substrate specificity of the sequence shown in SEQ ID NO:2; or (d) consists of the sequence shown in SEQ ID NO:3.
 13. A nucleic acid molecule according to claim 12 (c) wherein the molecule is capable of hybridising in high stringency conditions.
 14. A nucleic acid molecule according to claim 12 which is capable of hybridising to a gene which is located at band p13.3 on human chromosome 19.
 15. A nucleic acid molecule according to claim 12 which does not contain 5′ or 3′ untranslated regions.
 16. A fragment of a nucleic acid molecule consisting of the sequence shown in SEQ ID NO:1, which encodes a peptide which has the substrate specificity of the sequence shown in SEQ ID NO:2.
 17. A fragment according to claim 16 which consists of part of the sequence shown in SEQ ID NO:1.
 18. A vector comprising a nucleic acid molecule according to claim 12.
 19. A cell comprising a vector according to claim 18.
 20. A composition comprising a peptide according to claim 1.
 21. An antibody which is capable of binding to a peptide according to claim 1.
 22. An antibody according to claim 21 which is produced by a hybridoma cell.
 23. A hybridoma cell capable of making an antibody according to claim 22.
 24. A peptide comprising the sequence shown in SEQ ID NO:7.
 25. A nucleic acid molecule comprising the sequence shown in SEQ ID NO:8.
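
By way of illustration only, the following Python sketch shows one way a reader might check a three-letter-coded sequence listing for the motifs recited in claim 1 (b), and estimate a percent identity of the kind referred to in claims 1 (c), 2 and 3. The function names `to_one_letter` and `percent_identity` are hypothetical and do not appear in the specification. The identity calculation assumes two pre-aligned, equal-length, ungapped sequences, which is a simplifying assumption; this section does not state how identity is to be computed. The first fragment is copied from the SEQ ID NO:7 listing, and the comparison stretch is the corresponding run of residues in the SEQ ID NO:6 listing.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert a run of three-letter residues to one-letter codes.

    Position numbers and whitespace between residues are ignored.
    The input must contain residue codes only (no header words such
    as the organism name); an unknown code raises KeyError.
    """
    residues = re.findall(r"[A-Z][a-z]{2}", listing)
    return "".join(THREE_TO_ONE[r] for r in residues)


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two equal-length, ungapped sequences.

    Assumption: the sequences are already aligned; no gaps are handled.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Motif from claim 1 (b), written without the residue-number superscript.
motif = to_one_letter("HisGlyTrpSerTyrGlyGlyPheLeu")

# Fragment copied from the SEQ ID NO:7 listing, including its position numbers.
fragment = to_one_letter(
    "Ser Arg Val Ala Ile His Gly Trp Ser Tyr Gly Gly Phe Leu Ser Leu 690 695 700"
)

# Corresponding stretch copied from the SEQ ID NO:6 listing.
other = to_one_letter("His Gly Trp Ser Tyr Gly Gly Tyr Leu")

print(motif in fragment)               # True: the motif occurs in the fragment
print(percent_identity(motif, other))  # 8 of 9 positions are identical
```

A whole-sequence identity figure, such as the 60%, 75% and 95% thresholds in the claims, would in practice require a gapped alignment of the full-length sequences, which this sketch does not attempt.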